add support for reverse lookup from swh:1:ori:... PIDs to origin URLs
Now that we have defined an intrinsic PID schema for origins and support for it in both swh identify
and swh-graph (as graph roots), we need a way to reverse lookup from origin PIDs to origin URLs.
As I understand it, that means:
- adding a column to the origin table for the origin checksum (either as a PID or, more consistently with the rest of the SQL schema, as a SHA1 checksum)
- patching the storage functions that create new origins to also fill the SHA1 column
- adding a storage function to perform the SHA1→URL lookup
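
To make the three points above concrete, here is a minimal sketch in Python. It uses an in-memory SQLite database as a stand-in for the real storage backend, and it assumes that the origin checksum is the SHA1 of the UTF-8 encoded origin URL. All names (`make_db`, `origin_sha1`, `origin_add`, `origin_get_by_sha1`, `origin_url_from_pid`) are hypothetical and not the actual storage API.

```python
import hashlib
import sqlite3

ORIGIN_PID_PREFIX = "swh:1:ori:"


def make_db():
    """Hypothetical stand-in for the storage DB, with a nullable sha1 column."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE origin (id INTEGER PRIMARY KEY, url TEXT NOT NULL, sha1 BLOB)"
    )
    return db


def origin_sha1(url):
    """Assumption: the origin checksum is the SHA1 of the UTF-8 encoded URL."""
    return hashlib.sha1(url.encode("utf-8")).digest()


def origin_add(db, url):
    """Create a new origin and fill its sha1 column at insertion time."""
    sha1 = origin_sha1(url)
    db.execute("INSERT INTO origin (url, sha1) VALUES (?, ?)", (url, sha1))
    return sha1


def origin_get_by_sha1(db, sha1):
    """SHA1 -> URL lookup; returns None when no origin matches."""
    row = db.execute("SELECT url FROM origin WHERE sha1 = ?", (sha1,)).fetchone()
    return row[0] if row else None


def origin_url_from_pid(db, pid):
    """Resolve a swh:1:ori:<hex sha1> PID to an origin URL."""
    if not pid.startswith(ORIGIN_PID_PREFIX):
        raise ValueError("not an origin PID: %r" % pid)
    return origin_get_by_sha1(db, bytes.fromhex(pid[len(ORIGIN_PID_PREFIX):]))
```

Storing the raw 20-byte digest rather than the full PID string would match the second option mentioned above (a SHA1 column); the PID prefix is then only handled at the lookup boundary.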
For the transition we will need to:
- initially mark the SHA1 column as NULL-able
- deploy in production a storage version that fills the SHA1 for //new// origins
- perform a one-off conversion of all old origins that have NULL SHA1s
- mark the SHA1 column as non-NULL-able (and add a B-tree index on it)
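
The one-off conversion and the final step could look roughly like the sketch below. It again uses SQLite from Python as a stand-in for the production database, assumes an `origin` table with `id`, `url` and a nullable `sha1` column, and assumes the checksum is the SHA1 of the UTF-8 encoded URL. The function names and the index name `origin_sha1_idx` are hypothetical.

```python
import hashlib
import sqlite3


def backfill_origin_sha1(db, batch_size=1000):
    """Fill sha1 for old origins where it is still NULL, in batches.

    Returns the total number of rows updated.
    """
    total = 0
    while True:
        rows = db.execute(
            "SELECT id, url FROM origin WHERE sha1 IS NULL LIMIT ?",
            (batch_size,),
        ).fetchall()
        if not rows:
            return total
        db.executemany(
            "UPDATE origin SET sha1 = ? WHERE id = ?",
            [
                (hashlib.sha1(url.encode("utf-8")).digest(), origin_id)
                for origin_id, url in rows
            ],
        )
        db.commit()  # commit per batch to keep transactions short
        total += len(rows)


def finalize_origin_sha1(db):
    """Check that no NULL sha1 remains, then index the column.

    SQLite indexes are B-trees. SQLite cannot add NOT NULL to an existing
    column, so that constraint is not applied in this sketch; on PostgreSQL
    it would be:
        ALTER TABLE origin ALTER COLUMN sha1 SET NOT NULL;
    """
    remaining = db.execute(
        "SELECT count(*) FROM origin WHERE sha1 IS NULL"
    ).fetchone()[0]
    if remaining:
        raise RuntimeError("%d origins still have a NULL sha1" % remaining)
    db.execute("CREATE INDEX IF NOT EXISTS origin_sha1_idx ON origin (sha1)")
```

Running the backfill only after the new storage version is deployed means no further NULL rows can appear once the loop finishes, so the check in `finalize_origin_sha1` should pass on the first attempt.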
Migrated from T2045 (view on Phabricator)