Intrinsic identifiers for origins
We currently use an incrementing integer to uniquely identify origins. This does not work well with a distributed database (e.g. Cassandra), and, unlike the identifiers of most objects in the archive, it is not intrinsic.
So we should define a new identifier for origins. Current options:
- A 2-tuple (type, url). Pros: useful information can be derived from that identifier without an API request.
- A hash of the type and url. Pros: fixed-size and compact.
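To make the two options concrete, here is a minimal sketch of each. Everything in it is an assumption for illustration only: the names `OriginKey` and `origin_hash_id`, the choice of SHA-1, and the length-prefixed serialization of the two fields are hypothetical and not something this task has decided.

```python
import hashlib
from typing import NamedTuple


class OriginKey(NamedTuple):
    """Option 1 (hypothetical name): the identifier is the 2-tuple itself."""

    origin_type: str
    url: str


def origin_hash_id(origin_type: str, url: str) -> str:
    """Option 2 (hypothetical name): a fixed-size hash of the type and url.

    Assumption: SHA-1 over a length-prefixed encoding of both fields.
    The length prefix keeps ("ab", "c") and ("a", "bc") from producing
    the same byte string before hashing.
    """
    parts = (origin_type.encode("utf-8"), url.encode("utf-8"))
    payload = b"".join(len(p).to_bytes(4, "big") + p for p in parts)
    return hashlib.sha1(payload).hexdigest()


# Usage: the tuple exposes its fields directly, the hash is opaque but compact.
key = OriginKey("git", "https://example.org/repo.git")
hashed = origin_hash_id(key.origin_type, key.url)
```

The tuple can be inspected without any lookup (`key.url`), while the hash always has the same length regardless of how long the url is, which is the trade-off described in the two options above.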
Migrated from T1731 (view on Phabricator)