Prepare an environment to test the ClearlyDefined integration
The ClearlyDefined[1] project could help generate extrinsic metadata for the archive content. The service is accessible through a REST API[2], but it is rate limited. To avoid this limit, a mirror can be installed on our infrastructure and kept in sync through an ingestion proxy[3].
An intern will work on this subject. We should provide a VM with enough disk space and memory to host the mirror. A preconfigured PostgreSQL database could be useful too.
The disk space needed is estimated at 2 TB (cf. the homepage of the GitHub project[3]).
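To illustrate what the rate limit means for a client of the public API, here is a minimal Python sketch, not the planned integration. It assumes the `GET /definitions/{type}/{provider}/{namespace}/{name}/{revision}` route of the public service, that a rate-limited request is answered with HTTP 429, and that a numeric `Retry-After` header may be present. The helper names (`definition_url`, `retry_delay`, `fetch_definition`) are hypothetical.

```python
import json
import time
import urllib.error
import urllib.parse
import urllib.request

API_ROOT = "https://api.clearlydefined.io"


def definition_url(type_, provider, namespace, name, revision, root=API_ROOT):
    """Build the definitions URL for one component.

    ClearlyDefined coordinates use "-" when there is no namespace.
    """
    parts = [type_, provider, namespace or "-", name, revision]
    quoted = [urllib.parse.quote(part, safe="") for part in parts]
    return root + "/definitions/" + "/".join(quoted)


def retry_delay(headers, attempt, base=1.0, cap=60.0):
    """Return the number of seconds to wait after a rate-limited response.

    Assumption: the server may send a numeric Retry-After header. If it
    does not, fall back to capped exponential backoff.
    """
    value = headers.get("Retry-After")
    if value is not None and value.strip().isdigit():
        return min(float(value), cap)
    return min(base * 2 ** attempt, cap)


def fetch_definition(url, max_attempts=5):
    """Fetch one definition as a dict, retrying only on HTTP 429 (assumed)."""
    for attempt in range(max_attempts):
        try:
            with urllib.request.urlopen(url, timeout=30) as response:
                return json.load(response)
        except urllib.error.HTTPError as err:
            # Re-raise anything that is not a rate limit, or the last failure.
            if err.code != 429 or attempt == max_attempts - 1:
                raise
            time.sleep(retry_delay(err.headers, attempt))
```

For example, `fetch_definition(definition_url("npm", "npmjs", None, "lodash", "4.17.21"))` would request a single definition. Fetching definitions for a large share of the archive this way would spend most of its time waiting, which is the motivation for a local mirror.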
Steps:
- #2890: Onboard tg19999
- infra/puppet/puppet-swh-site!284: Create DB instance on staging for now [4]
- infra/puppet/puppet-swh-site!285, infra/swh-sysadmin-provisioning!48: Create VM instance with DB access
[2] https://api.clearlydefined.io/api-docs/#/definitions/get_definitions
[4] uffizi turned out to have some disk limitations
Migrated from T2865 (view on Phabricator)