cassandra: Use concurrent queries in *_missing() instead of naive grouping

Some references in the commit message have been migrated:

T3577#72791 is now swh/infra/sysadm-environment#3577 (closed)

Build is green

Patch application report for D6885 (id=24967)

Rebasing onto 259bf6fe...

Current branch diff-target is up to date.

Changes applied before test

commit 4a24505049d5c34c264d2b27e5feb24719b9e674
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Jan 6 12:41:45 2022 +0100

    cassandra: Use concurrent queries in *_missing() instead of naive grouping
    
    Instead of grouping ids in queries in arbitrary batches (which forces
    the server node to coordinate with other nodes to complete the query),
    this sends queries with one id each, directly to the right node.
    
    This is the 'concurrent' algorithm from https://forge.softwareheritage.org/swh/infra/sysadm-environment#3577
    which gives a >=2x speed-up on directories, and a >=8x speed-up on revisions.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1517/ for more details.
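
For readers unfamiliar with the approach, here is a minimal sketch of the idea in the commit message: issue one lookup per id concurrently, rather than grouping ids into arbitrary batches. It is not the code from this patch. The `missing_ids` function, the `exists` callable, and the `concurrency` parameter are hypothetical names chosen for illustration, and a thread pool with an in-memory set stands in for the Cassandra session and table.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def missing_ids(
    ids: Iterable[bytes],
    exists: Callable[[bytes], bool],
    concurrency: int = 16,
) -> List[bytes]:
    """Return the ids for which `exists` reports no row, in input order.

    `exists` is a hypothetical stand-in for a single-id query
    (something like SELECT ... WHERE id = ?). Because such a query
    touches one partition, it can be sent directly to a node that
    owns it, with no cross-node coordination.
    """
    ids = list(ids)
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # One lookup per id, issued concurrently; map() preserves input order.
        found = list(pool.map(exists, ids))
    return [id_ for id_, present in zip(ids, found) if not present]


# Usage: an in-memory set plays the role of the table (assumption).
stored = {b"a", b"c"}
print(missing_ids([b"a", b"b", b"c", b"d"], stored.__contains__))
# → [b'b', b'd']
```

The trade-off is more round trips in exchange for no coordinator fan-out, which is what the commit message credits for the speed-up on directories and revisions.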

mentioned in merge request !756 (closed)

mentioned in merge request !727 (closed)

Merge request was accepted

Merge request was merged

closed

approved this merge request
