Add counting storage proxy

It will be used in the Cassandra experiment.

Currently we use the built-in counters of the Cassandra backend, but in addition to being inaccurate, they seem to be a bottleneck.

This proxy will be a lightweight solution for counting object insertions, without needing to run Kafka on the test cluster.
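
For illustration only, here is a minimal sketch of the general idea of a counting proxy: wrap a backend, forward every call, and tally what the `*_add` methods report as inserted. It is not the code in `swh/storage/proxies/counter.py`; the `InMemoryBackend` and `CountingProxy` classes, the `counts` attribute, and the `"<type>:add"` summary key are all assumptions made for this example.

```python
from collections import Counter
from typing import Any, Dict, Iterable


class InMemoryBackend:
    """Hypothetical stand-in for a storage backend (assumption for this sketch)."""

    def __init__(self) -> None:
        self.contents: Dict[str, Dict[str, Any]] = {}

    def content_add(self, contents: Iterable[Dict[str, Any]]) -> Dict[str, int]:
        added = 0
        for content in contents:
            # Only objects that are not already stored count as insertions.
            if content["id"] not in self.contents:
                self.contents[content["id"]] = content
                added += 1
        return {"content:add": added}


class CountingProxy:
    """Forwards calls to the backend and tallies objects added per type."""

    def __init__(self, backend: Any) -> None:
        self.backend = backend
        self.counts: Counter = Counter()

    def __getattr__(self, name: str) -> Any:
        # Only reached for attributes not found on the proxy itself.
        if name in ("backend", "counts"):
            raise AttributeError(name)
        attr = getattr(self.backend, name)
        if not name.endswith("_add") or not callable(attr):
            return attr  # reads and other calls pass through untouched

        object_type = name[: -len("_add")]

        def wrapper(objects: Iterable[Dict[str, Any]]) -> Dict[str, int]:
            summary = attr(list(objects))
            # Count what the backend reports as actually inserted.
            self.counts[object_type] += summary.get(f"{object_type}:add", 0)
            return summary

        return wrapper


proxy = CountingProxy(InMemoryBackend())
proxy.content_add([{"id": "a"}, {"id": "b"}])
proxy.content_add([{"id": "b"}, {"id": "c"}])  # "b" is already stored
print(proxy.counts["content"])
```

Keeping the tally in the proxy rather than in the database is what avoids both the backend counters and a Kafka-based pipeline; in a real deployment the in-process `Counter` would presumably be replaced by a shared counting service.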


Migrated from D6149 (view on Phabricator)

Closed by Phabricator Migration user 3 years ago (Aug 27, 2021 11:59am UTC)

Activity

  • Build has FAILED

    Patch application report for D6149 (id=22252)

    Could not rebase; Attempt merge onto b110d1b6...

    Merge made by the 'recursive' strategy.
     swh/storage/__init__.py             |  5 ++-
     swh/storage/cassandra/cql.py        | 88 ++++++++++++++++++++++++++++++++++---
     swh/storage/cassandra/storage.py    | 24 ++++++++--
     swh/storage/in_memory.py            |  1 +
     swh/storage/proxies/counter.py      | 66 ++++++++++++++++++++++++++++
     swh/storage/tests/test_cassandra.py |  7 +--
     swh/storage/tests/test_counter.py   | 63 ++++++++++++++++++++++++++
     7 files changed, 238 insertions(+), 16 deletions(-)
     create mode 100644 swh/storage/proxies/counter.py
     create mode 100644 swh/storage/tests/test_counter.py
    Changes applied before test
    commit d14d3815aed40d765d6939d90396299c96a9a727
    Merge: b110d1b6 1875046f
    Author: Jenkins user <jenkins@localhost>
    Date:   Fri Aug 27 09:32:42 2021 +0000
    
        Merge branch 'diff-target' into HEAD
    
    commit 1875046f31eaa61e3f999e351f86dfba66b58680
    Author: Valentin Lorentz <vlorentz@softwareheritage.org>
    Date:   Fri Aug 27 11:32:03 2021 +0200
    
        Add counting storage proxy
        
        It will be used in the Cassandra experiment.
        
        Currently we use the built-in counters of the Cassandra backend; but in
        addition to being inaccurate, they seem to be a bottleneck.
        
        This proxy will be a lightweight solution for counting object insertion,
        without needing to run Kafka on the test cluster.
    
    commit 39c7212deb5b32d2486b39d1498b6636f3c86893
    Author: Valentin Lorentz <vlorentz@softwareheritage.org>
    Date:   Thu Aug 26 12:20:26 2021 +0200
    
        Update test
    
    commit 459bc9d6656f3764120682218d87af73e881ec4b
    Author: Valentin Lorentz <vlorentz@softwareheritage.org>
    Date:   Thu Aug 26 11:45:22 2021 +0200
    
        Fix in-mem
    
    commit 6b27a722815e25c4f64ff3f137328728fbcb7518
    Author: Valentin Lorentz <vlorentz@softwareheritage.org>
    Date:   Thu Aug 26 11:08:15 2021 +0200
    
        cassandra: Add option to select (hopefully) more efficient batch insertion algos
        
        This adds a new config option for the cassandra backend,
        'directory_entries_insert_algo', with three possible values:
        
        * 'one-per-one' is the default, and preserves the current naive behavior
        * 'concurrent' and 'batch' are attempts at being more efficient

    Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1376/
    See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1376/console

  • Author Maintainer

    add missing dep

  • Build is green

    Patch application report for D6149 (id=22253)

    Could not rebase; Attempt merge onto b110d1b6...

    Merge made by the 'recursive' strategy.
     requirements-swh.txt                |  1 +
     swh/storage/__init__.py             |  5 ++-
     swh/storage/cassandra/cql.py        | 88 ++++++++++++++++++++++++++++++++++---
     swh/storage/cassandra/storage.py    | 24 ++++++++--
     swh/storage/in_memory.py            |  1 +
     swh/storage/proxies/counter.py      | 66 ++++++++++++++++++++++++++++
     swh/storage/tests/test_cassandra.py |  7 +--
     swh/storage/tests/test_counter.py   | 63 ++++++++++++++++++++++++++
     8 files changed, 239 insertions(+), 16 deletions(-)
     create mode 100644 swh/storage/proxies/counter.py
     create mode 100644 swh/storage/tests/test_counter.py
    Changes applied before test
    commit 3f67bd62b7a45363aef6d80c608603b0a87c801b
    Merge: b110d1b6 b10788d3
    Author: Jenkins user <jenkins@localhost>
    Date:   Fri Aug 27 09:44:11 2021 +0000
    
        Merge branch 'diff-target' into HEAD
    
    commit b10788d3789fa1010d45ac57f79a16c8c3627502
    Author: Valentin Lorentz <vlorentz@softwareheritage.org>
    Date:   Fri Aug 27 11:32:03 2021 +0200
    
        Add counting storage proxy
        
        It will be used in the Cassandra experiment.
        
        Currently we use the built-in counters of the Cassandra backend; but in
        addition to being inaccurate, they seem to be a bottleneck.
        
        This proxy will be a lightweight solution for counting object insertion,
        without needing to run Kafka on the test cluster.
    
    commit 39c7212deb5b32d2486b39d1498b6636f3c86893
    Author: Valentin Lorentz <vlorentz@softwareheritage.org>
    Date:   Thu Aug 26 12:20:26 2021 +0200
    
        Update test
    
    commit 459bc9d6656f3764120682218d87af73e881ec4b
    Author: Valentin Lorentz <vlorentz@softwareheritage.org>
    Date:   Thu Aug 26 11:45:22 2021 +0200
    
        Fix in-mem
    
    commit 6b27a722815e25c4f64ff3f137328728fbcb7518
    Author: Valentin Lorentz <vlorentz@softwareheritage.org>
    Date:   Thu Aug 26 11:08:15 2021 +0200
    
        cassandra: Add option to select (hopefully) more efficient batch insertion algos
        
        This adds a new config option for the cassandra backend,
        'directory_entries_insert_algo', with three possible values:
        
        * 'one-per-one' is the default, and preserves the current naive behavior
        * 'concurrent' and 'batch' are attempts at being more efficient

    See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1377/ for more details.

  • Antoine R. Dumont mentioned in merge request !709 (closed)

  • Antoine R. Dumont mentioned in merge request !707 (closed)

  • Merge request was accepted

  • Antoine R. Dumont approved this merge request

  • couple of typos inline.

  • Author Maintainer

    rebase + fix typos

  • Build is green

    Patch application report for D6149 (id=22265)

    Rebasing onto b110d1b6...

    First, rewinding head to replay your work on top of it...
    Applying: Add counting storage proxy
    Changes applied before test
    commit 2bf29b23ecdfad28345476337eec695aabf26c85
    Author: Valentin Lorentz <vlorentz@softwareheritage.org>
    Date:   Fri Aug 27 11:32:03 2021 +0200
    
        Add counting storage proxy
        
        It will be used in the Cassandra experiment.
        
        Currently we use the built-in counters of the Cassandra backend; but in
        addition to being inaccurate, they seem to be a bottleneck.
        
        This proxy will be a lightweight solution for counting object insertion,
        without needing to run Kafka on the test cluster.

    See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1383/ for more details.

  • Author Maintainer

    rebase

  • Build is green

    Patch application report for D6149 (id=22269)

    Rebasing onto b110d1b6...

    Current branch diff-target is up to date.
    Changes applied before test
    commit 47a6919fee499dd51fb0098099e895088a1a7c25
    Author: Valentin Lorentz <vlorentz@softwareheritage.org>
    Date:   Fri Aug 27 11:32:03 2021 +0200
    
        Add counting storage proxy
        
        It will be used in the Cassandra experiment.
        
        Currently we use the built-in counters of the Cassandra backend; but in
        addition to being inaccurate, they seem to be a bottleneck.
        
        This proxy will be a lightweight solution for counting object insertion,
        without needing to run Kafka on the test cluster.

    See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1385/ for more details.

  • Author Maintainer

    Merge request was merged

  • closed
