packagist: Reimplement lister using new Lister API
The previous implementation generated tasks for a Packagist loader that was never implemented.
The new implementation extracts the source repository URL, VCS type and last update date for each package referenced by Packagist and sends that information to the scheduler.
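
As a rough illustration of that extraction step, the sketch below builds one origin from the version entries of a single package. The entry layout (a "source" mapping with "type" and "url", plus an ISO 8601 "time" field) is an assumption about the metadata shape, and the Origin record and origin_from_versions helper are hypothetical names, not the lister's actual code.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, NamedTuple, Optional


class Origin(NamedTuple):
    # Hypothetical record used only for this sketch.
    url: str
    visit_type: str
    last_update: Optional[datetime]


def origin_from_versions(versions: List[Dict[str, Any]]) -> Optional[Origin]:
    """Return the origin described by a package's version entries, if any.

    Assumed entry shape: {"source": {"type": ..., "url": ...}, "time": ...}.
    """
    url = None
    visit_type = None
    last_update = None
    for version in versions:
        source = version.get("source") or {}
        # Keep the first usable source repository found.
        if url is None and source.get("url") and source.get("type"):
            url, visit_type = source["url"], source["type"]
        raw_time = version.get("time")
        if raw_time:
            updated = datetime.fromisoformat(raw_time)
            if updated.tzinfo is None:
                # Assumption: timestamps without an offset are UTC.
                updated = updated.replace(tzinfo=timezone.utc)
            # Track the most recent release date across all versions.
            if last_update is None or updated > last_update:
                last_update = updated
    if url is None:
        return None
    return Origin(url, visit_type, last_update)
```

Packages without any source repository yield None here, so they would simply be skipped rather than sent to the scheduler.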
Package metadata is retrieved using Packagist API endpoints whose
responses are served from static files, which are guaranteed to be
efficient on the Packagist side (no dynamic queries).
Furthermore, subsequent listings send the If-Modified-Since HTTP
header so that only package metadata updated since the previous
listing operation is retrieved. This saves bandwidth and returns only
origins that might have newly released versions.
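
A minimal sketch of how such a conditional request could be prepared, using only the standard library. The helper names conditional_headers and needs_processing are hypothetical, and the sketch assumes the date of the previous listing is stored as a timezone-aware datetime; it does not show how the lister itself performs the HTTP call.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict, Optional


def conditional_headers(last_listing: Optional[datetime]) -> Dict[str, str]:
    """Build request headers for a listing (hypothetical helper).

    On the first listing there is no previous date, so no header is sent
    and every package is fetched unconditionally.
    """
    if last_listing is None:
        return {}
    # HTTP dates must be expressed in GMT; last_listing is assumed to be
    # timezone-aware so the conversion to UTC is unambiguous.
    http_date = format_datetime(last_listing.astimezone(timezone.utc), usegmt=True)
    return {"If-Modified-Since": http_date}


def needs_processing(status_code: int) -> bool:
    """A 304 Not Modified response means the package has not changed."""
    return status_code != 304
```

With a previous listing dated 2021-01-14 12:30 UTC, the header value is "Thu, 14 Jan 2021 12:30:00 GMT". A server that honours the header answers 304 with an empty body for unchanged packages, which is where the bandwidth saving comes from.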
I tested the lister intensively yesterday and it worked without any
issues each time I ran it. The first execution took around 90 minutes
and listed 286510 origins with three different visit types: git, hg and
svn. Subsequent runs took less time thanks to the If-Modified-Since
HTTP header and only returned packages modified since the last listing.
Closes #2991
Migrated from D4990 (view on Phabricator)