
Add save-bulk lister to check origins prior to their insertion in the database

Antoine Lambert requested to merge anlambert/swh-lister:bulk-lister into master

This new, special-purpose lister verifies a user-provided list of origins to archive (submitted, for instance, through the Web API).

The list of origins to check is retrieved from a paginated HTTP endpoint.
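
As an illustration of consuming such an endpoint, here is a minimal sketch. It assumes page-number based pagination where an empty page marks the end; the actual pagination scheme is not described here, and `iter_origins` and `fetch_page` are hypothetical names. The page fetcher is passed in as a callable so the loop can be exercised without a network connection.

```python
from typing import Callable, Dict, Iterator, List

Origin = Dict[str, str]  # e.g. {"origin_url": ..., "visit_type": ...} (assumed shape)


def iter_origins(fetch_page: Callable[[int], List[Origin]]) -> Iterator[Origin]:
    """Yield origins page by page until an empty page is returned.

    `fetch_page` is a hypothetical callable that returns the origins of the
    given page number (starting at 1), e.g. by issuing an HTTP GET request.
    """
    page = 1
    while True:
        origins = fetch_page(page)
        if not origins:
            # Assumption: an empty page signals the end of the listing.
            break
        yield from origins
        page += 1


# Usage with an in-memory stand-in for the HTTP endpoint.
_pages = {
    1: [{"origin_url": "https://example.org/a.git", "visit_type": "git"}],
    2: [{"origin_url": "https://example.org/b", "visit_type": "svn"}],
}
listed = list(iter_origins(lambda n: _pages.get(n, [])))
```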

Its purpose is to avoid polluting the scheduler database with origins that cannot be loaded into the archive.

Each origin is identified by a URL and a visit type. For a given visit type, the lister checks whether the origin URL can be found and whether the visit type is valid.

The supported visit types are those for VCS (bzr, cvs, hg, git and svn), plus the one for loading tarball content into the archive.
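
The checks described above could be sketched roughly as follows. This is not the lister's actual code: `rejection_reason` and `url_exists` are hypothetical names, `"tarball"` is a placeholder for the real tarball visit type name, and the visit type check is reduced to a membership test, whereas the lister may perform deeper, type-specific verification.

```python
from typing import Callable, Optional
from urllib.parse import urlparse

VCS_VISIT_TYPES = {"bzr", "cvs", "hg", "git", "svn"}
TARBALL_VISIT_TYPE = "tarball"  # placeholder name, an assumption
SUPPORTED_VISIT_TYPES = VCS_VISIT_TYPES | {TARBALL_VISIT_TYPE}


def rejection_reason(
    url: str, visit_type: str, url_exists: Callable[[str], bool]
) -> Optional[str]:
    """Return None if the origin is accepted, else a short rejection reason.

    `url_exists` is a hypothetical probe (e.g. an HTTP HEAD request or a
    VCS-specific remote query) injected by the caller.
    """
    if visit_type not in SUPPORTED_VISIT_TYPES:
        return "unsupported visit type"
    parsed = urlparse(url)
    if not parsed.scheme or not parsed.netloc:
        return "malformed URL"
    if not url_exists(url):
        return "URL not found"
    return None
```

Injecting the existence probe keeps the decision logic testable offline, since each visit type would likely need its own way of reaching the remote.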

Accepted origins are upserted into the scheduler database (inserted if new, updated otherwise).

Rejected origins are stored in the lister state.
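
The split between accepted and rejected origins can be pictured with the sketch below. It uses a plain dictionary as a stand-in for the scheduler database and a small dataclass as a stand-in for the lister state; `ListerState`, `process_origins` and `check` are hypothetical names, not the real swh-lister or swh-scheduler API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Optional, Tuple

OriginKey = Tuple[str, str]  # (url, visit_type)


@dataclass
class ListerState:
    # Maps each rejected origin to the reason it was rejected.
    rejected: Dict[OriginKey, str] = field(default_factory=dict)


def process_origins(
    origins: Iterable[OriginKey],
    check: Callable[[str, str], Optional[str]],
    scheduler_db: Dict[OriginKey, dict],
    state: ListerState,
) -> None:
    """Upsert accepted origins and record rejected ones in the state."""
    for url, visit_type in origins:
        reason = check(url, visit_type)
        if reason is None:
            # Keying on (url, visit_type) makes the write an upsert:
            # an existing row is overwritten instead of duplicated.
            scheduler_db[(url, visit_type)] = {"url": url, "visit_type": visit_type}
        else:
            state.rejected[(url, visit_type)] = reason


# Usage with a toy check that only accepts git origins.
db: Dict[OriginKey, dict] = {}
state = ListerState()
process_origins(
    [("https://example.org/a.git", "git"), ("https://example.org/b", "npm")],
    lambda url, vt: None if vt == "git" else "unsupported visit type",
    db,
    state,
)
```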

Related to #4709 (closed)

Edited by Antoine Lambert
