loaders: Move the proxy storage filter after the buffer proxy

in their pipeline configuration

Context: since D3976, some DVCS loaders send one object at a time to the storage.

This change allows batching calls to the *_missing endpoints for DVCS loaders (e.g. the git loader).

This slightly impacts the package loaders, but the impact should be negligible.

Prior to this, we filtered out known objects first, then kept a buffer of the remaining unknown objects, which was flushed to the storage once a threshold was hit.

Now, we will buffer all objects and then filter on said buffer of objects. So we may increase calls to the *_missing endpoints.
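
The effect of the ordering on DVCS loaders can be illustrated with a small simulation. This is only a sketch: `ToyBackend`, `ToyFilter`, `ToyBuffer`, `object_missing` and `object_add` are hypothetical stand-ins invented for this example, not the actual swh.storage proxy classes or endpoints. It assumes a loader that sends objects one at a time, none of which the storage knows yet.

```python
from typing import Iterable, List, Set


class ToyBackend:
    """Hypothetical stand-in for the remote storage; counts endpoint calls."""

    def __init__(self) -> None:
        self.known: Set[str] = set()
        self.missing_calls = 0
        self.add_calls = 0

    def object_missing(self, ids: Iterable[str]) -> List[str]:
        self.missing_calls += 1
        return [i for i in ids if i not in self.known]

    def object_add(self, ids: Iterable[str]) -> None:
        self.add_calls += 1
        self.known.update(ids)

    def flush(self) -> None:
        pass  # nothing is buffered in the backend itself


class ToyFilter:
    """Hypothetical filter step: one *_missing call per add, drops known ids."""

    def __init__(self, storage) -> None:
        self.storage = storage

    def object_missing(self, ids: Iterable[str]) -> List[str]:
        return self.storage.object_missing(ids)

    def object_add(self, ids: Iterable[str]) -> None:
        missing = self.storage.object_missing(list(ids))
        if missing:
            self.storage.object_add(missing)

    def flush(self) -> None:
        self.storage.flush()


class ToyBuffer:
    """Hypothetical buffer step: holds ids until min_batch_size is reached."""

    def __init__(self, storage, min_batch_size: int) -> None:
        self.storage = storage
        self.min_batch_size = min_batch_size
        self.pending: List[str] = []

    def object_missing(self, ids: Iterable[str]) -> List[str]:
        return self.storage.object_missing(ids)

    def object_add(self, ids: Iterable[str]) -> None:
        self.pending.extend(ids)
        if len(self.pending) >= self.min_batch_size:
            self.flush()

    def flush(self) -> None:
        if self.pending:
            self.storage.object_add(self.pending)
            self.pending = []
        self.storage.flush()


def run(order: str, n_objects: int = 10, min_batch_size: int = 5):
    """Send n_objects one at a time; return (missing_calls, add_calls)."""
    backend = ToyBackend()
    if order == "filter-first":  # previous configuration
        pipeline = ToyFilter(ToyBuffer(backend, min_batch_size))
    else:  # "buffer-first": configuration after this change
        pipeline = ToyBuffer(ToyFilter(backend), min_batch_size)
    for i in range(n_objects):
        pipeline.object_add([f"obj{i}"])
    pipeline.flush()
    return backend.missing_calls, backend.add_calls


if __name__ == "__main__":
    print("filter-first:", run("filter-first"))
    print("buffer-first:", run("buffer-first"))
```

In this toy model, with ten objects sent one at a time and a batch size of five, the filter-first order issues ten missing-object calls while the buffer-first order issues two; both orders issue two add calls. Loaders that already send large batches would see a different picture, which this sketch does not model.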

Related to D3976
Related to T2373

Test Plan

  • ran on a staging node with this change; everything runs fine (git, npm, pypi... see T2373#49135)

  • octocatalog

bin/octocatalog-diff --octocatalog-diff-args --no-truncate-details --to staging worker01
Found host worker01.softwareheritage.org
Cloning into '/tmp/swh-ocd.lTqt9H4A/environments/production/data/private'...
done.
Cloning into '/tmp/swh-ocd.lTqt9H4A/environments/staging/data/private'...
done.
*** Running octocatalog-diff on host worker01.softwareheritage.org
I, [2020-09-18T14:06:13.917723 #6126]  INFO -- : Catalogs compiled for worker01.softwareheritage.org
I, [2020-09-18T14:06:14.899756 #6126]  INFO -- : Diffs computed for worker01.softwareheritage.org
diff origin/production/worker01.softwareheritage.org current/worker01.softwareheritage.org
*******************************************
  File[/etc/softwareheritage/loader_archive.yml] =>
   parameters =>
     content =>
      @@ -4,5 +4,4 @@
         steps:
         - cls: retry
      -  - cls: filter
         - cls: buffer
           min_batch_size:
      @@ -12,4 +11,5 @@
             revision: 1000
             release: 1000
      +  - cls: filter
         - cls: remote
           args:
*******************************************
  File[/etc/softwareheritage/loader_cran.yml] =>
   parameters =>
     content =>
      @@ -4,5 +4,4 @@
         steps:
         - cls: retry
      -  - cls: filter
         - cls: buffer
           min_batch_size:
      @@ -12,4 +11,5 @@
             revision: 1000
             release: 1000
      +  - cls: filter
         - cls: remote
           args:
*******************************************
  File[/etc/softwareheritage/loader_debian.yml] =>
   parameters =>
     content =>
      @@ -4,5 +4,4 @@
         steps:
         - cls: retry
      -  - cls: filter
         - cls: buffer
           min_batch_size:
      @@ -12,4 +11,5 @@
             revision: 1000
             release: 1000
      +  - cls: filter
         - cls: remote
           args:
*******************************************
  File[/etc/softwareheritage/loader_deposit.yml] =>
   parameters =>
     content =>
      @@ -4,5 +4,4 @@
         steps:
         - cls: retry
      -  - cls: filter
         - cls: buffer
           min_batch_size:
      @@ -12,4 +11,5 @@
             revision: 1000
             release: 1000
      +  - cls: filter
         - cls: remote
           args:
*******************************************
  File[/etc/softwareheritage/loader_git.yml] =>
   parameters =>
     content =>
      @@ -4,5 +4,4 @@
         steps:
         - cls: retry
      -  - cls: filter
         - cls: buffer
           min_batch_size:
      @@ -12,4 +11,5 @@
             revision: 1000
             release: 1000
      +  - cls: filter
         - cls: remote
           args:
*******************************************
  File[/etc/softwareheritage/loader_mercurial.yml] =>
   parameters =>
     content =>
      @@ -4,5 +4,4 @@
         steps:
         - cls: retry
      -  - cls: filter
         - cls: buffer
           min_batch_size:
      @@ -12,4 +11,5 @@
             revision: 1000
             release: 1000
      +  - cls: filter
         - cls: remote
           args:
*******************************************
  File[/etc/softwareheritage/loader_nixguix.yml] =>
   parameters =>
     content =>
      @@ -4,5 +4,4 @@
         steps:
         - cls: retry
      -  - cls: filter
         - cls: buffer
           min_batch_size:
      @@ -12,4 +11,5 @@
             revision: 1000
             release: 1000
      +  - cls: filter
         - cls: remote
           args:
*******************************************
  File[/etc/softwareheritage/loader_npm.yml] =>
   parameters =>
     content =>
      @@ -4,5 +4,4 @@
         steps:
         - cls: retry
      -  - cls: filter
         - cls: buffer
           min_batch_size:
      @@ -12,4 +11,5 @@
             revision: 1000
             release: 1000
      +  - cls: filter
         - cls: remote
           args:
*******************************************
  File[/etc/softwareheritage/loader_pypi.yml] =>
   parameters =>
     content =>
      @@ -4,5 +4,4 @@
         steps:
         - cls: retry
      -  - cls: filter
         - cls: buffer
           min_batch_size:
      @@ -12,4 +11,5 @@
             revision: 1000
             release: 1000
      +  - cls: filter
         - cls: remote
           args:
*******************************************
  File[/etc/softwareheritage/loader_svn.yml] =>
   parameters =>
     content =>
      @@ -4,5 +4,4 @@
         steps:
         - cls: retry
      -  - cls: filter
         - cls: buffer
           min_batch_size:
      @@ -12,4 +11,5 @@
             revision: 1000
             release: 1000
      +  - cls: filter
         - cls: remote
           args:
*******************************************
*** End octocatalog-diff on worker01.softwareheritage.org

Migrated from D3986 (view on Phabricator)