Persistent identifiers (PIDs): add a way to describe Merkle DAG paths
[updated on the basis of #1241 (closed) below]
The goal of this task is to define the canonical way of describing //paths// in the SWH Merkle DAG. Formally, a path describes how one goes from a given node in the Merkle DAG, which we call the //anchor//, to another node, the //endpoint//, by following the edges of the DAG.
We observe that when the anchor denotes a revision (and most often when it's a release), it's trivial to find in the DAG the root directory of the source code, and we only need the file path to identify the content we are interested in. When it's a snapshot, there is a default root directory to point to.
Hence we have concluded that for the vast majority of use cases it is enough to extend the syntax and semantics of our SWH-IDs with the following optional elements:
- anchor : the swh-id of the //anchor node in the DAG//: this can be a snapshot, a release, a revision, or a directory
- path : the full path from the root directory of the anchor to the //endpoint// object, which can be a directory or a file content
- visit : the swh-id of the //snapshot// in whose context the anchor must be shown
Here is a full example:
swh:1:cnt:4d99d2d18326621ccdd70f5ea66c2e2ac236ad8b; anchor=swh:1:rev:2db189928c94d62a3b4757b3eec68f0a4d4113f0; path=/Examples/SimpleFarm/simplefarm.ml; visit=swh:1:snp:d7f1b9eb7ccb596c2622c4780febaa02549830f9; origin=https://gitorious.org/ocamlp3l/ocamlp3l_cvs.git; lines=12-23
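As an illustration of how such an identifier could be split into its core swh-id and its optional elements, here is a minimal Python sketch. The function name `parse_qualified_swhid` is hypothetical and not part of any existing SWH module. The sketch assumes that elements are separated by `;`, that each element has the form `key=value`, and that whitespace around separators can be ignored (as in the example above); it also assumes that values contain no literal `;`.

```python
import re
from typing import Dict, Tuple

# Core identifier shape, as seen in the example: swh:1:<type>:<40 hex digits>.
CORE_RE = re.compile(r"swh:1:(snp|rel|rev|dir|cnt):[0-9a-f]{40}")

EXAMPLE = (
    "swh:1:cnt:4d99d2d18326621ccdd70f5ea66c2e2ac236ad8b; "
    "anchor=swh:1:rev:2db189928c94d62a3b4757b3eec68f0a4d4113f0; "
    "path=/Examples/SimpleFarm/simplefarm.ml; "
    "visit=swh:1:snp:d7f1b9eb7ccb596c2622c4780febaa02549830f9; "
    "origin=https://gitorious.org/ocamlp3l/ocamlp3l_cvs.git; "
    "lines=12-23"
)


def parse_qualified_swhid(text: str) -> Tuple[str, Dict[str, str]]:
    """Split an identifier into (core swh-id, {element name: value}).

    Hypothetical helper: assumes ';' separates elements and that
    values never contain a literal ';'.
    """
    core, *parts = [piece.strip() for piece in text.split(";")]
    if not CORE_RE.fullmatch(core):
        raise ValueError(f"not a core swh-id: {core!r}")

    elements: Dict[str, str] = {}
    for part in parts:
        if not part:
            continue  # tolerate a trailing ';'
        # Split on the first '=' only, so values may themselves contain '='.
        key, sep, value = part.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"malformed element: {part!r}")
        elements[key.strip()] = value.strip()
    return core, elements


core, elements = parse_qualified_swhid(EXAMPLE)
print(core)
for name, value in elements.items():
    print(f"  {name} = {value}")
```

Keeping the elements in a plain dictionary leaves the core swh-id usable on its own, which matches the intent that these elements are optional context rather than part of the object's identity.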
We checked with @anlambert that all the pieces of information needed to generate these optional elements for the swh-ids are already available in the WebApp view, so it will be straightforward to provide end users with this kind of link.
On the receiving end, we also have all the information needed to check that the object and its context match:
- for the //path// part, we just need to follow the path from the anchor and check that the endpoint has the declared swh-id
- for the //visit// part, we might (if we want) make a request to the swh graph to check that the anchor node is indeed in the subgraph rooted at the given snapshot
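The two checks above can be sketched in Python on a toy in-memory model of the DAG. Everything here is an assumption made for illustration: the node labels (`dir:root`, `rev:r1`, ...) are placeholders rather than real swh-ids, the dictionaries stand in for whatever storage or graph service would actually be queried, and the function names `resolve_path` and `is_reachable` are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, Optional

# Toy data: directory node -> {entry name: child node}.
DIR_ENTRIES: Dict[str, Dict[str, str]] = {
    "dir:root": {"Examples": "dir:examples"},
    "dir:examples": {"SimpleFarm": "dir:farm"},
    "dir:farm": {"simplefarm.ml": "cnt:file"},
}

# Toy data: generic DAG edges (snapshot -> revisions -> root directories).
EDGES: Dict[str, Iterable[str]] = {
    "snp:s1": ["rev:r2"],
    "rev:r2": ["rev:r1"],
    "rev:r1": ["dir:root"],
}


def resolve_path(
    root_dir: str, path: str, dir_entries: Dict[str, Dict[str, str]]
) -> Optional[str]:
    """Follow `path` from `root_dir`; return the endpoint node, or None."""
    node = root_dir
    for name in (part for part in path.split("/") if part):
        entries = dir_entries.get(node)
        if entries is None or name not in entries:
            return None  # not a directory, or no such entry
        node = entries[name]
    return node


def is_reachable(start: str, target: str, edges: Dict[str, Iterable[str]]) -> bool:
    """Breadth-first search: is `target` in the subgraph rooted at `start`?"""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for successor in edges.get(node, ()):
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return False


# Path check: the endpoint found by walking the path must match the declared id.
declared = "cnt:file"
found = resolve_path("dir:root", "/Examples/SimpleFarm/simplefarm.ml", DIR_ENTRIES)
print("path matches:", found == declared)

# Visit check: the anchor must be reachable from the snapshot.
print("anchor in snapshot:", is_reachable("snp:s1", "rev:r1", EDGES))
```

In this sketch the path check starts from the anchor's root directory, which is assumed to have been looked up beforehand (for a revision anchor, this is the directory the revision points to).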
Migrated from T1241 (view on Phabricator)