Add test for origin_visit_get_latest in presence of mismatched id and date orders

Build is green

Patch application report for D6121 (id=22143)

Could not rebase; Attempt merge onto 9f00eb9d...

Updating 9f00eb9d..e291c74b
Fast-forward
 swh/storage/cassandra/cql.py       | 45 ++++++++++++++++++++++++
 swh/storage/cassandra/model.py     |  4 +--
 swh/storage/cassandra/schema.py    |  2 +-
 swh/storage/cassandra/storage.py   | 26 ++++++++++++--
 swh/storage/in_memory.py           | 11 ++++++
 swh/storage/tests/storage_tests.py | 70 ++++++++++++++++++++++++++++++++++++++
 6 files changed, 152 insertions(+), 6 deletions(-)

Changes applied before test

commit e291c74b04b8e7501f4e41ea237591038ff2d9b8
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Aug 20 20:11:51 2021 +0200

    Add test for origin_visit_get_latest in presence of mismatched id and date orders
    
    It was unclear this actually worked; I had to write this test to realize
    the code wasn't buggy.
    
    Also replaced a conditional that is always False (because Cassandra
    always returns results in the order of the clustering key) with an
    assertion, so the code is less confusing.

commit 724a67e06fd6e6c9ed93c28dae79db43239e7fc9
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Aug 20 18:12:26 2021 +0200

    cassandra: Bump next_visit_id when origin_visit_add is called by a replayer
    
    When called by a replayer, the visit.visit field is set; but
    origin.next_visit_id was never incremented, so on the next loader
    run, the visit id would be 1 even if there is already a visit
    with that id.

commit a3cc0dc7b104bc8b7f05988a7e0e26fae462ac7f
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Aug 20 13:52:17 2021 +0200

    cassandra: Make content_missing query in batches
    
    Instead of calling content_find() for each object, which needs to make
    two queries for each.
    
    Given the latency of Cassandra queries, this should be a significant
    speed-up (possibly up to 100 times faster, as this is the value of
    PARTITION_KEY_RESTRICTION_MAX_SIZE).
    
    This also changes the schema, because CQL does not allow doing `IN`
    queries on compound partition keys.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1366/ for more details.
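
For readers following the `content_missing` commit above, the batching idea can be sketched roughly as below. This is only an illustration, not the code from the patch: `grouper`, `missing_in_batches` and the `query_present` callable are hypothetical names, and the batch size of 100 is taken from the commit message's statement about PARTITION_KEY_RESTRICTION_MAX_SIZE.

```python
from typing import Callable, Iterable, Iterator, List, Set

# Value quoted in the commit message; assumed here, not read from swh.storage.
PARTITION_KEY_RESTRICTION_MAX_SIZE = 100


def grouper(items: Iterable[bytes], n: int) -> Iterator[List[bytes]]:
    """Yield successive lists of at most n items."""
    batch: List[bytes] = []
    for item in items:
        batch.append(item)
        if len(batch) == n:
            yield batch
            batch = []
    if batch:
        yield batch


def missing_in_batches(
    hashes: Iterable[bytes],
    query_present: Callable[[List[bytes]], Set[bytes]],
) -> Iterator[bytes]:
    """Yield the hashes that the backend does not know about.

    query_present is a hypothetical stand-in for a single `IN` query:
    it receives one batch and returns the subset that exists.
    """
    for batch in grouper(hashes, PARTITION_KEY_RESTRICTION_MAX_SIZE):
        present = query_present(batch)  # one round trip per batch
        for h in batch:
            if h not in present:
                yield h
```

With 250 hashes this makes three round trips (100, 100 and 50 items) instead of two queries per hash, which is where the expected speed-up comes from.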

Merge request was accepted

approved this merge request

rebase

Build is green

Patch application report for D6121 (id=22164)

Could not rebase; Attempt merge onto 7113198f...

Updating 7113198f..8f1cdf65
Fast-forward
 swh/storage/cassandra/cql.py       | 45 ++++++++++++++++++++++++
 swh/storage/cassandra/model.py     |  4 +--
 swh/storage/cassandra/schema.py    |  2 +-
 swh/storage/cassandra/storage.py   | 26 ++++++++++++--
 swh/storage/in_memory.py           | 11 ++++++
 swh/storage/tests/storage_tests.py | 70 ++++++++++++++++++++++++++++++++++++++
 6 files changed, 152 insertions(+), 6 deletions(-)

Changes applied before test

commit 8f1cdf65a1056dac42755e8c70ae38f3d34aa459
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Aug 20 20:11:51 2021 +0200

    Add test for origin_visit_get_latest in presence of mismatched id and date orders
    
    It was unclear this actually worked; I had to write this test to realize
    the code wasn't buggy.
    
    Also replaced a conditional that is always False (because Cassandra
    always returns results in the order of the clustering key) with an
    assertion, so the code is less confusing.

commit cf880db30bb549ccbdbb2cdd05b61d124ed90be7
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Aug 20 18:12:26 2021 +0200

    cassandra: Bump next_visit_id when origin_visit_add is called by a replayer
    
    When called by a replayer, the visit.visit field is set; but
    origin.next_visit_id was never incremented, so on the next loader
    run, the visit id would be 1 even if there is already a visit
    with that id.

commit 54b5abfb26267ad56a67ad9fa2dd9d5d075e30f0
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Aug 20 13:52:17 2021 +0200

    cassandra: Make content_missing query in batches
    
    Instead of calling content_find() for each object, which needs to make
    two queries for each.
    
    Given the latency of Cassandra queries, this should be a significant
    speed-up (possibly up to 100 times faster, as this is the value of
    PARTITION_KEY_RESTRICTION_MAX_SIZE).
    
    This also changes the schema, because CQL does not allow doing `IN`
    queries on compound partition keys.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1370/ for more details.
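
The `next_visit_id` fix described in the second commit can be pictured with a small in-memory model. This is a sketch under assumptions, not the Cassandra code: `VisitIdAllocator` and `add_visit` are hypothetical names, and using `max()` to bump the counter is one plausible way to get the behaviour the commit message describes.

```python
from typing import Dict, Optional


class VisitIdAllocator:
    """Toy model of a per-origin visit id counter (hypothetical)."""

    def __init__(self) -> None:
        # origin URL -> next visit id to hand out; ids start at 1
        self._next_visit_id: Dict[str, int] = {}

    def add_visit(self, origin: str, visit: Optional[int] = None) -> int:
        """Record a visit and return its id.

        A loader passes visit=None and gets the next free id.
        A replayer passes an explicit id, which must also move the
        counter forward so later loader runs do not reuse it.
        """
        next_id = self._next_visit_id.get(origin, 1)
        if visit is None:
            visit = next_id
        # Bump the counter even when the id was supplied by the caller.
        self._next_visit_id[origin] = max(next_id, visit + 1)
        return visit
```

Without the bump on the replayer path, a loader run following a replayed visit with id 1 would be assigned id 1 again, which is the collision the commit message reports.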

Merge request was merged

closed

Closed by Phabricator Migration user 3 years ago (Aug 24, 2021 2:14pm UTC)
