cassandra: Split author/committer/date/committer_date into individual columns

vlorentz requested to merge no-person-udt into master

Cassandra does not support filtering on individual fields of a UDT, because it treats the whole structure as a single value.

However, the infra team needs to filter on author.email and committer.email, so these fields need their own columns.

This commit reads and writes the new split columns, but still reads the UDT as a fallback for rows that have not been migrated yet. The fallback will be removed once all rows have been migrated.
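
To illustrate the fallback, here is a minimal sketch of the read path. It is not the code from this merge request: the helper name `read_person`, the dict-shaped row, and the rule "a non-null `<prefix>_fullname` means the row was already migrated" are all assumptions made for the example. Only the column names come from the migration plan below.

```python
from typing import Any, Mapping, Optional


def read_person(row: Mapping[str, Any], prefix: str) -> Optional[dict]:
    """Return the person stored under `prefix` ("author" or "committer").

    Hypothetical helper: prefers the split columns and falls back to the
    legacy UDT column when the split columns are empty.
    """
    fullname = row.get(f"{prefix}_fullname")
    if fullname is not None:
        # New layout: one column per field (assumed to be populated together).
        return {
            "fullname": fullname,
            "name": row.get(f"{prefix}_name"),
            "email": row.get(f"{prefix}_email"),
        }

    # Fallback: the row has not been migrated yet, so read the UDT column.
    legacy = row.get(prefix)
    if legacy is None:
        return None
    return {
        "fullname": legacy["fullname"],
        "name": legacy["name"],
        "email": legacy["email"],
    }
```

Once every row has been rewritten with the split columns, the fallback branch becomes dead code and can be deleted together with the UDT column.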

Migration plan:

  1. ALTER TABLE revision
     ADD (
         author_fullname                 blob,
         author_name                     blob,
         author_email                    blob,
         committer_fullname              blob,
         committer_name                  blob,
         committer_email                 blob,
         date_seconds                    bigint,
         date_microseconds               int,
         date_offset_bytes               blob,
         committer_date_seconds          bigint,
         committer_date_microseconds     int,
         committer_date_offset_bytes     blob
     );
     ALTER TABLE release
     ADD (
         author_fullname                 blob,
         author_name                     blob,
         author_email                    blob,
         date_seconds                    bigint,
         date_microseconds               int,
         date_offset_bytes               blob
     );
  2. update Python code and restart

  3. run a replayer on revision and release objects without a filtering proxy, in order to write the new columns
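
As a rough sketch of what "writing the new columns" in step 3 amounts to, the function below flattens a revision into the twelve split columns added in step 1. The input shape (nested dicts with `fullname`/`name`/`email`, and dates carrying `timestamp.seconds`, `timestamp.microseconds` and `offset_bytes`) and all function names are assumptions for illustration; the actual replayer and model classes are not shown in this merge request.

```python
from typing import Any, Dict, Mapping, Optional


def split_person(prefix: str, person: Optional[Mapping[str, Any]]) -> Dict[str, Any]:
    """Map a person to <prefix>_fullname / _name / _email columns."""
    fields = ("fullname", "name", "email")
    return {
        f"{prefix}_{field}": (person[field] if person is not None else None)
        for field in fields
    }


def split_date(prefix: str, date: Optional[Mapping[str, Any]]) -> Dict[str, Any]:
    """Map a date to <prefix>_seconds / _microseconds / _offset_bytes columns."""
    if date is None:
        return {
            f"{prefix}_seconds": None,
            f"{prefix}_microseconds": None,
            f"{prefix}_offset_bytes": None,
        }
    return {
        f"{prefix}_seconds": date["timestamp"]["seconds"],
        f"{prefix}_microseconds": date["timestamp"]["microseconds"],
        f"{prefix}_offset_bytes": date["offset_bytes"],
    }


def revision_split_columns(rev: Mapping[str, Any]) -> Dict[str, Any]:
    """Return the values for the columns added to the revision table."""
    columns: Dict[str, Any] = {}
    columns.update(split_person("author", rev.get("author")))
    columns.update(split_person("committer", rev.get("committer")))
    columns.update(split_date("date", rev.get("date")))
    columns.update(split_date("committer_date", rev.get("committer_date")))
    return columns
```

A release would use the same helpers with only the `author` and `date` prefixes, matching the shorter column list in the second ALTER TABLE statement.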

Edited by vlorentz
