cassandra/raw_extrinsic_metadata: Stop inserting null entries

We currently insert null values in Cassandra [1], as per the analysis [2].

This stops those null insertions. Instead of storing null values, this stores empty values ("" for strings [e.g. directory, revision, release, snapshot, ...], b"" for bytes [e.g. path]).

On read, those empty (byte) strings are replaced with None when it makes sense, which keeps the objects read compliant with the RawExtrinsicMetadata model object.
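As a rough illustration of the write/read conversion described above, here is a minimal sketch. The helper names (`to_row`, `from_row`) and the field tuples are hypothetical and only cover the fields named in this description; they are not the actual swh-storage code.

```python
from typing import Any, Dict

# Assumption: field lists taken from the examples named in this description;
# the real list of optional columns may be longer.
STR_FIELDS = ("directory", "revision", "release", "snapshot")
BYTES_FIELDS = ("path",)


def to_row(obj: Dict[str, Any]) -> Dict[str, Any]:
    """Replace None with an empty value before insertion (hypothetical helper)."""
    row = dict(obj)
    for name in STR_FIELDS:
        if row.get(name) is None:
            row[name] = ""
    for name in BYTES_FIELDS:
        if row.get(name) is None:
            row[name] = b""
    return row


def from_row(row: Dict[str, Any]) -> Dict[str, Any]:
    """Map empty values back to None so the result fits the model (hypothetical helper)."""
    obj = dict(row)
    for name in STR_FIELDS:
        if obj.get(name) == "":
            obj[name] = None
    for name in BYTES_FIELDS:
        if obj.get(name) == b"":
            obj[name] = None
    return obj
```

With this approach a stored empty value cannot be told apart from a missing one on read, so it only fits fields where an empty value carries no meaning of its own.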

  • I have not adapted the main API calls, as I did not find where to evolve their tests so that they exercise Cassandra.

[2] swh/infra/sysadm-environment#5287 (comment 168469)

[1]

2024-03-20 14:35:18 softwareheritage@albertina:5432 λ select target from raw_extrinsic_metadata where visit is null limit 1;
+----------------------------------------------------+
|                       target                       |
+----------------------------------------------------+
| swh:1:dir:d6599a075021f7821e720fb88dd8f263437a67b4 |
+----------------------------------------------------+
(1 row)

Time: 43.009 ms
2024-03-20 14:37:00 softwareheritage@albertina:5432 λ select target from raw_extrinsic_metadata where origin is null limit 1;
+----------------------------------------------------+
|                       target                       |
+----------------------------------------------------+
| swh:1:ori:79fa5aae8de39b1b715b8697836f798b01b04cdb |
+----------------------------------------------------+
(1 row)

Time: 7.382 ms
2024-03-20 14:37:04 softwareheritage@albertina:5432 λ select target from raw_extrinsic_metadata where snapshot is null limit 1;
+----------------------------------------------------+
|                       target                       |
+----------------------------------------------------+
| swh:1:dir:d6599a075021f7821e720fb88dd8f263437a67b4 |
+----------------------------------------------------+
(1 row)

Time: 6.172 ms
2024-03-20 14:37:16 softwareheritage@albertina:5432 λ select target from raw_extrinsic_metadata where release is null limit 1;
+----------------------------------------------------+
|                       target                       |
+----------------------------------------------------+
| swh:1:dir:d6599a075021f7821e720fb88dd8f263437a67b4 |
+----------------------------------------------------+
(1 row)

Time: 5.558 ms
2024-03-20 14:37:26 softwareheritage@albertina:5432 λ select target from raw_extrinsic_metadata where revision is null limit 1;
+----------------------------------------------------+
|                       target                       |
+----------------------------------------------------+
| swh:1:ori:79fa5aae8de39b1b715b8697836f798b01b04cdb |
+----------------------------------------------------+
(1 row)

Time: 5.410 ms
2024-03-20 14:37:30 softwareheritage@albertina:5432 λ select target from raw_extrinsic_metadata where path is null limit 1;
+----------------------------------------------------+
|                       target                       |
+----------------------------------------------------+
| swh:1:dir:d6599a075021f7821e720fb88dd8f263437a67b4 |
+----------------------------------------------------+
(1 row)

Time: 9.282 ms
2024-03-20 14:37:33 softwareheritage@albertina:5432 λ select target from raw_extrinsic_metadata where directory is null limit 1;
+----------------------------------------------------+
|                       target                       |
+----------------------------------------------------+
| swh:1:dir:d6599a075021f7821e720fb88dd8f263437a67b4 |
+----------------------------------------------------+
(1 row)

Time: 6.879 ms

Refs. swh/infra/sysadm-environment#5287 (closed)

Edited by Antoine R. Dumont
