
lister/loader: Ingest archived artifacts from cran mirror

The keyword is "archived": as of now, we only ingest the main artifact.

There are two sides to this, which can be done independently and in any order:

Lister

algo:

  • drop the R CRAN script
  • parse the listing page instead (as in simple_lister; check how the cgit lister does it) [1]
  • for each package found there, send the origin url [2] to the loader (as a recurring task)
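
A minimal sketch of the listing-page approach, using only the Python standard library. It is not the actual lister code: the link pattern (`web/packages/<name>/index.html`), the origin URL template, and the shape of the returned task dicts are all assumptions made for illustration, and the real markup and the real origin url [2] should be checked before relying on any of it.

```python
import re
from html.parser import HTMLParser

# Assumed shape of package links on the listing page (hypothetical; adjust
# to the real markup).
PACKAGE_HREF = re.compile(r"(?:^|/)web/packages/([^/]+)/index\.html$")

# Hypothetical origin URL template; the issue does not spell out the format.
ORIGIN_URL_TEMPLATE = "https://cran.r-project.org/package={name}"


class PackageLinkParser(HTMLParser):
    """Collect package names from <a href="..."> tags matching PACKAGE_HREF."""

    def __init__(self):
        super().__init__()
        self.packages = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        match = PACKAGE_HREF.search(href)
        if match:
            self.packages.append(match.group(1))


def list_origins(listing_html):
    """Yield one task description per distinct package, in page order.

    The dict keys are illustrative only: `uid` is set to the origin URL and
    `policy` to "recurring", following the schema notes in this issue.
    """
    parser = PackageLinkParser()
    parser.feed(listing_html)
    parser.close()
    seen = set()
    for name in parser.packages:
        if name in seen:
            continue  # skip duplicate links to the same package
        seen.add(name)
        origin_url = ORIGIN_URL_TEMPLATE.format(name=name)
        yield {"uid": origin_url, "origin_url": origin_url, "policy": "recurring"}
```

Fetching the page and handing each dict to the scheduler are left out on purpose, since they depend on the lister base class in use.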

schema adaptations:

  • make the tasks output by the lister recurring (currently oneshot)
  • adapt the uid field to hold the origin_url's value

migration plan:

  • truncate the cran_repo table
  • trigger a full listing again

Loader

algo:

Related to #2029 (closed)


Migrated from T2241 (view on Phabricator)

Edited by Phabricator Migration user