cassandra: Split author/committer/date/committer_date into individual columns

Open vlorentz requested to merge no-person-udt into master
  1. Mar 18, 2025
  2. Mar 17, 2025
  3. Dec 23, 2024
    • Change type of minimal_revision from hg to git · 51d8725c
      vlorentz authored
      test_extid_add_hg expects all hg revisions to have a 'node' extra header,
      which minimal_revision does not have.
    • Fix rebase · 75d167a2
      vlorentz authored
    • Flatten dates too + add tests for nulls · 49e06138
      vlorentz authored and vlorentz committed
    • Fix docstring · f5aaa1f5
      vlorentz authored and vlorentz committed
    • cassandra: Split author/committer into individual columns · f61d649c
      vlorentz authored and vlorentz committed
      Cassandra does not support filtering on individual fields of a UDT, because it
      treats each structure as a single value.
      
      However, the infra team needs to filter on author.email and committer.email, hence the need
      for separate columns.
      
      This commit reads and writes the new split columns, but keeps reading the UDT as
      a fallback. The fallback will be removed once all rows have been migrated.
      
      Migration plan:
      
      1. add the new columns:
         ```
         ALTER TABLE revision
         ADD (
             author_fullname                 blob,
             author_name                     blob,
             author_email                    blob,
             committer_fullname              blob,
             committer_name                  blob,
             committer_email                 blob
         );
         ALTER TABLE release
         ADD (
             author_fullname                 blob,
             author_name                     blob,
             author_email                    blob
         );
         ```
      
      2. update Python code and restart
      
      3. run a replayer on `revision` and `release` objects without a filtering proxy,
         in order to write the new columns
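
The read-with-fallback behavior described in this commit message could look roughly like the sketch below. It is not the code from this merge request: `person_from_row`, `person_to_columns`, and the use of plain dicts for rows and persons are hypothetical, and it assumes that a non-null `<prefix>_fullname` column is what marks a row as already migrated.

```python
from typing import Any, Dict, Mapping, Optional

# Field names taken from the ALTER TABLE statements in the migration plan.
FIELDS = ("fullname", "name", "email")


def person_to_columns(
    person: Optional[Mapping[str, Any]], prefix: str
) -> Dict[str, Any]:
    """Flatten a person into split columns (hypothetical helper).

    A missing person is written as three null columns.
    """
    if person is None:
        return {f"{prefix}_{field}": None for field in FIELDS}
    return {f"{prefix}_{field}": person[field] for field in FIELDS}


def person_from_row(row: Mapping[str, Any], prefix: str) -> Optional[Dict[str, Any]]:
    """Read a person from the split columns, falling back to the legacy UDT.

    Assumption: a non-null `<prefix>_fullname` means the row was migrated.
    """
    if row.get(f"{prefix}_fullname") is not None:
        # Migrated row: the split columns are authoritative.
        return {field: row.get(f"{prefix}_{field}") for field in FIELDS}

    legacy = row.get(prefix)  # legacy UDT value, modeled here as a dict
    if legacy is None:
        # Neither representation is set: the person is genuinely null.
        return None
    return {field: legacy[field] for field in FIELDS}
```

Once step 3 of the migration plan has rewritten every row, the second half of `person_from_row` would be dead code, which matches the commit's stated intent to remove the fallback later.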