cassandra: Make content_missing query in batches

Build is green

Patch application report for D6118 (id=22137)

Rebasing onto 9f00eb9d...

Current branch diff-target is up to date.

Changes applied before test

commit 0f89a9dc7c86eec7dbf2c75180dfd008d6881196
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Aug 20 13:52:17 2021 +0200

    cassandra: Make content_missing query in batches
    
    Instead of calling content_find() for each object, which needs to make
    two queries for each.
    
    Given the latency of Cassandra queries, this should be a significant
    speed-up (possibly up to 100 times faster, as this is the value of
    PARTITION_KEY_RESTRICTION_MAX_SIZE).

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1362/ for more details.
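
The batching described in the commit message can be sketched roughly as follows. This is only an illustration, not the code from this diff: `grouper`, `select_present` and `FakeBackend` are hypothetical names, and `select_present` stands in for a single `IN` query that returns the subset of a batch already present in the table. The batch size of 100 is the value the commit message gives for PARTITION_KEY_RESTRICTION_MAX_SIZE.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Set

# Value quoted in the commit message; the real constant lives in the storage code.
PARTITION_KEY_RESTRICTION_MAX_SIZE = 100


def grouper(items: Iterable[bytes], size: int) -> Iterator[List[bytes]]:
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def content_missing(
    hashes: Iterable[bytes],
    select_present: Callable[[List[bytes]], Set[bytes]],
    batch_size: int = PARTITION_KEY_RESTRICTION_MAX_SIZE,
) -> Iterator[bytes]:
    """Yield the hashes that the backend does not know about.

    `select_present` is a hypothetical stand-in for one `IN` query: it takes
    a batch of hashes and returns the subset that exists in the table.
    """
    for batch in grouper(hashes, batch_size):
        present = select_present(batch)  # one round trip per batch
        for h in batch:
            if h not in present:
                yield h


class FakeBackend:
    """In-memory stand-in for the database that counts round trips."""

    def __init__(self, stored: Set[bytes]) -> None:
        self.stored = stored
        self.queries = 0

    def select_present(self, batch: List[bytes]) -> Set[bytes]:
        self.queries += 1
        return self.stored.intersection(batch)


if __name__ == "__main__":
    backend = FakeBackend({bytes([i]) for i in range(0, 250, 2)})
    wanted = [bytes([i]) for i in range(250)]
    missing = list(content_missing(wanted, backend.select_present))
    print(len(missing), backend.queries)  # prints "125 3"
```

With 250 hashes and a batch size of 100, the sketch issues 3 queries, where a per-object lookup making two queries each would issue 500.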

mention schema change in commit

Build was aborted

Patch application report for D6118 (id=22138)

Rebasing onto 9f00eb9d...

Current branch diff-target is up to date.

Changes applied before test

commit a3cc0dc7b104bc8b7f05988a7e0e26fae462ac7f
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Aug 20 13:52:17 2021 +0200

    cassandra: Make content_missing query in batches
    
    Instead of calling content_find() for each object, which needs to make
    two queries for each.
    
    Given the latency of Cassandra queries, this should be a significant
    speed-up (possibly up to 100 times faster, as this is the value of
    PARTITION_KEY_RESTRICTION_MAX_SIZE).
    
    This also changes the schema, because CQL does not allow doing `IN`
    queries on compound partition keys.

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1363/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1363/console

Build is green

Patch application report for D6118 (id=22138)

Rebasing onto 9f00eb9d...

Current branch diff-target is up to date.

Changes applied before test

commit a3cc0dc7b104bc8b7f05988a7e0e26fae462ac7f
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Aug 20 13:52:17 2021 +0200

    cassandra: Make content_missing query in batches
    
    Instead of calling content_find() for each object, which needs to make
    two queries for each.
    
    Given the latency of Cassandra queries, this should be a significant
    speed-up (possibly up to 100 times faster, as this is the value of
    PARTITION_KEY_RESTRICTION_MAX_SIZE).
    
    This also changes the schema, because CQL does not allow doing `IN`
    queries on compound partition keys.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1364/ for more details.

Performance is now OK for the read part with a batch size of 1000 for content, directory and revision.

Merge request was accepted

approved this merge request

rebase

Build has FAILED

Patch application report for D6118 (id=22162)

Rebasing onto 7113198f...

Current branch diff-target is up to date.

Changes applied before test

commit 54b5abfb26267ad56a67ad9fa2dd9d5d075e30f0
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Aug 20 13:52:17 2021 +0200

    cassandra: Make content_missing query in batches
    
    Instead of calling content_find() for each object, which needs to make
    two queries for each.
    
    Given the latency of Cassandra queries, this should be a significant
    speed-up (possibly up to 100 times faster, as this is the value of
    PARTITION_KEY_RESTRICTION_MAX_SIZE).
    
    This also changes the schema, because CQL does not allow doing `IN`
    queries on compound partition keys.

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1368/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1368/console

Merge request was merged

closed

Closed by Phabricator Migration user on Aug 24, 2021 2:14pm UTC
