
packagist: Reimplement lister using new Lister API

The previous implementation was generating tasks for an unimplemented Packagist loader.

The new implementation extracts the source repository URL, VCS type and last update date for each package referenced by Packagist and sends that information to the scheduler.
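
To illustrate the extraction step, here is a minimal sketch and not the lister's actual code. It assumes each package's metadata is a list of per-version dicts that may carry a `source` dict (`type`, `url`) and an ISO 8601 `time` string, which is modelled on Composer-style metadata. The function name `extract_origin` and the sample URL are hypothetical.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional, Tuple


def extract_origin(
    versions: List[Dict[str, Any]],
) -> Optional[Tuple[str, str, Optional[datetime]]]:
    """Return (url, visit_type, last_update) for one package, or None.

    ASSUMPTION: `versions` is a list of per-version metadata dicts, each
    optionally holding "source" ({"type": ..., "url": ...}) and "time"
    (an ISO 8601 string with a UTC offset).
    """
    url: Optional[str] = None
    visit_type: Optional[str] = None
    last_update: Optional[datetime] = None

    for version in versions:
        source = version.get("source") or {}
        # Keep the first version that names both a repository URL and a VCS type.
        if url is None and source.get("url") and source.get("type"):
            url, visit_type = source["url"], source["type"]

        # Track the most recent release date across all versions.
        released = version.get("time")
        if released:
            when = datetime.fromisoformat(released)
            if last_update is None or when > last_update:
                last_update = when

    if url is None or visit_type is None:
        # No usable source repository: nothing to send to the scheduler.
        return None
    return url, visit_type, last_update


if __name__ == "__main__":
    sample = [
        {
            "source": {"type": "git", "url": "https://example.org/acme/demo.git"},
            "time": "2021-01-05T10:00:00+00:00",
        },
        {"time": "2020-06-01T08:30:00+00:00"},
    ]
    print(extract_origin(sample))
```

Packages without any source repository yield `None` here, so they would simply be skipped rather than sent to the scheduler with an empty URL.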

Package metadata is retrieved from Packagist API endpoints whose responses are served from static files, which are guaranteed to be efficient on the Packagist side (no dynamic queries). Furthermore, subsequent listings send the If-Modified-Since HTTP header so that only package metadata updated since the previous listing operation is retrieved. This saves bandwidth and returns only origins that might have newly released versions.
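
The conditional request described above could look roughly like the following sketch, which uses only the Python standard library and is not taken from the lister itself. The helper names `conditional_headers` and `fetch_if_modified` are hypothetical, and the URL passed in is whatever metadata endpoint the caller chooses.

```python
import json
import urllib.error
import urllib.request
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Any, Dict, Optional


def conditional_headers(last_listing: Optional[datetime]) -> Dict[str, str]:
    """Build the If-Modified-Since header from the previous listing date.

    `last_listing` should be timezone-aware; a naive datetime would be
    interpreted as local time by astimezone().
    """
    if last_listing is None:
        # First listing: no condition, fetch everything.
        return {}
    as_utc = last_listing.astimezone(timezone.utc)
    # usegmt=True yields the HTTP date format, e.g. "Tue, 05 Jan 2021 10:00:00 GMT".
    return {"If-Modified-Since": format_datetime(as_utc, usegmt=True)}


def fetch_if_modified(url: str, last_listing: Optional[datetime]) -> Optional[Any]:
    """Return the decoded JSON body, or None if the server answers 304."""
    request = urllib.request.Request(url, headers=conditional_headers(last_listing))
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return json.load(response)
    except urllib.error.HTTPError as error:
        if error.code == 304:
            # Not modified since the previous listing: nothing new to list.
            return None
        raise


if __name__ == "__main__":
    previous = datetime(2021, 1, 5, 10, 0, tzinfo=timezone.utc)
    print(conditional_headers(previous))
```

A `None` result means the package metadata is unchanged, so the corresponding origin does not need to be sent to the scheduler again.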

I tested the lister intensively yesterday and it worked without any issues each time I executed it. The first execution took around 90 minutes and listed 286510 origins with three different visit types: git, hg and svn. Subsequent runs took less time thanks to the If-Modified-Since HTTP header and only returned packages modified since the last listing.

Closes #2991 (closed)


Migrated from D4990 (view on Phabricator)
