loaders: Move the proxy storage filter after the buffer proxy

in their pipeline configuration

Context: since swh-loader-core!363 (closed), some DVCS loaders now send one object at a time to the storage.

This change allows batching calls to the *_missing endpoints (for DVCS loaders, e.g. the git loader).

This slightly impacts the package loaders, but the impact should tend towards null.

Prior to this, we first filtered objects to keep only the unknown ones, then kept a buffer of those unknown objects, which was flushed to the storage once a threshold was hit.

Now, we buffer all objects first and then run the filter on that buffer of objects. So we may increase calls to the *_missing endpoints.
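To illustrate the effect of the ordering for a loader that sends one object at a time (the DVCS case described above), here is a minimal toy sketch. It is not the swh.storage implementation: the classes `CountingStorage`, `FilterProxy` and `BufferProxy`, the batch size of 100, and the simplified `content_add` / `content_missing` signatures are all assumptions made for the example.

```python
class CountingStorage:
    """Toy backend (hypothetical) that counts the calls it receives."""

    def __init__(self):
        self.known = set()
        self.missing_calls = 0
        self.add_calls = 0

    def content_missing(self, ids):
        self.missing_calls += 1
        return [i for i in ids if i not in self.known]

    def content_add(self, ids):
        self.add_calls += 1
        self.known.update(ids)

    def flush(self):
        pass


class FilterProxy:
    """Toy filter: only forwards objects the next step reports as missing."""

    def __init__(self, storage):
        self.storage = storage

    def content_add(self, ids):
        missing = self.storage.content_missing(ids)
        if missing:
            self.storage.content_add(missing)

    def flush(self):
        self.storage.flush()


class BufferProxy:
    """Toy buffer: accumulates objects and forwards them in batches."""

    def __init__(self, storage, min_batch_size=100):
        self.storage = storage
        self.min_batch_size = min_batch_size
        self.buffer = []

    def content_missing(self, ids):
        # Reads are passed straight through to the next step.
        return self.storage.content_missing(ids)

    def content_add(self, ids):
        self.buffer.extend(ids)
        if len(self.buffer) >= self.min_batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            batch, self.buffer = self.buffer, []
            self.storage.content_add(batch)
        self.storage.flush()


def run(pipeline, backend, n=1000):
    """Send n unknown objects one at a time, as some DVCS loaders now do."""
    for i in range(n):
        pipeline.content_add([i])
    pipeline.flush()
    return backend.missing_calls, backend.add_calls


# Previous ordering: filter first, then buffer.
old_backend = CountingStorage()
old_counts = run(FilterProxy(BufferProxy(old_backend)), old_backend)

# New ordering: buffer first, then filter.
new_backend = CountingStorage()
new_counts = run(BufferProxy(FilterProxy(new_backend)), new_backend)

print(old_counts)  # (1000, 10): one *_missing call per object
print(new_counts)  # (10, 10): one *_missing call per batch
```

In this toy setup the number of add calls stays the same, while the *_missing calls drop from one per object to one per flushed batch. The sketch only covers the one-object-at-a-time case; loaders that already send large batches would behave differently.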

Related to swh-loader-core!363 (closed)
Related to swh/infra/puppet/puppet-swh-site!228 (closed)
Related to swh-loader-git#2373 (closed)

Test Plan

Rebuild the docker image so that it includes the latest loader-git 0.11.0 and this setup patch.

time doco exec swh-loader swh --log-level DEBUG loader run git git://git.savannah.gnu.org/guix

The loader does its job without a gazillion small calls to the *_missing endpoints.


Migrated from D3988 (view on Phabricator)
