Commit bc30e8bc authored by Stefano Zacchiroli

doc: add documentation of contextual information for persistent IDs

Closes T1041
parent 448eafa0
@@ -47,8 +47,8 @@ entry point of the grammar:
| "cnt" (* content *)
;
<object_id> ::= 40 * <hex_digit> ; (* intrinsic object id, as hex-encoded SHA1 *)
-<hex_digit> ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
-| "a" | "b" | "c" | "d" | "e" | "f" ;
+<dec_digit> ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
+<hex_digit> ::= <dec_digit> | "a" | "b" | "c" | "d" | "e" | "f" ;
Semantics
@@ -143,3 +143,51 @@ and will lead to the browsing page of the corresponding object, like this:
* `<https://archive.softwareheritage.org/browse/swh:1:rev:309cf2674ee7a0749978cf8265ab91a60aea0f7d>`_
* `<https://archive.softwareheritage.org/browse/swh:1:rel:22ece559cc7cc2364edc5e5593d63ae8bd229f9f>`_
* `<https://archive.softwareheritage.org/browse/swh:1:snp:c7c108084bc0bf3d81436bf980b46e98bd338453>`_

Contextual information
======================

It is often useful to complement persistent identifiers with **contextual
information** about where the identified object has been found as well as which
specific parts of it are of interest. To that end it is possible, via a
dedicated syntax, to extend persistent identifiers with the following pieces of
information:

* the **software origin** where an object has been found/observed
* the **line number(s)** of interest, usually within a content object

Syntax
------

The full syntax to complement identifiers with contextual information is given
by the ``<identifier_with_context>`` entry point of the grammar:

.. code-block:: bnf

   <identifier_with_context> ::= <identifier> [<lines_ctxt>] [<origin_ctxt>]
   <lines_ctxt> ::= ";" "lines" "=" <line_number> ["-" <line_number>]
   <origin_ctxt> ::= ";" "origin" "=" <url>
   <line_number> ::= <dec_digit> +
   <url> ::= (* RFC 3986 compliant URLs *)

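As an illustration of the grammar above, the following is a minimal Python sketch of a parser for ``<identifier_with_context>``. It is not the Software Heritage implementation: the names ``ContextualId`` and ``parse_identifier_with_context`` are hypothetical, the ``<identifier>`` part is only loosely approximated as ``swh:1:<type>:<40 hex digits>``, and the origin is captured verbatim without checking RFC 3986 compliance.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Loose approximation of <identifier> (assumption): "swh:1:", a lowercase
# object type, ":" and 40 hex digits. The optional groups follow the grammar:
# first the lines context, then the origin context.
_PATTERN = re.compile(
    r"(?P<identifier>swh:1:[a-z]+:[0-9a-f]{40})"
    r"(?:;lines=(?P<first>[0-9]+)(?:-(?P<last>[0-9]+))?)?"
    r"(?:;origin=(?P<origin>.+))?"
)


class ContextualId(NamedTuple):
    """Hypothetical container for a parsed identifier with context."""
    identifier: str
    lines: Optional[Tuple[int, int]]  # (first, last); equal for a single line
    origin: Optional[str]


def parse_identifier_with_context(text: str) -> ContextualId:
    """Split an identifier with context into its components."""
    match = _PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid identifier with context: {text!r}")
    lines = None
    if match["first"] is not None:
        first = int(match["first"])
        # A single line number is stored as a one-line range.
        last = int(match["last"]) if match["last"] is not None else first
        lines = (first, last)
    return ContextualId(match["identifier"], lines, match["origin"])
```

Because the origin context comes last in the grammar, the sketch takes everything after ``;origin=`` as the URL. This matters because RFC 3986 URLs may themselves contain ";" and "=", so naively splitting the whole string on ";" could cut a URL in two.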
Semantics
---------

";" is used as a separator between persistent identifiers and additional
optional contextual information. Each piece of contextual information is
specified as a key/value pair, using "=" as a separator.

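The separator rules can be sketched in the other direction too, by assembling an identifier with context from its parts. This is a hedged illustration only: ``format_identifier_with_context`` is a hypothetical helper, and the check that the first line number does not exceed the last is an extra assumption that the grammar itself does not state.

```python
from typing import Optional, Tuple


def format_identifier_with_context(
    identifier: str,
    lines: Optional[Tuple[int, int]] = None,
    origin: Optional[str] = None,
) -> str:
    """Join an identifier and optional key/value context pairs with ";"."""
    parts = [identifier]
    if lines is not None:
        first, last = lines
        # Assumption (not in the grammar): ranges are non-negative and ordered.
        if not 0 <= first <= last:
            raise ValueError(f"invalid line range: {lines!r}")
        if first == last:
            parts.append(f"lines={first}")
        else:
            parts.append(f"lines={first}-{last}")
    if origin is not None:
        # The origin is appended last, matching the order in the grammar.
        parts.append(f"origin={origin}")
    return ";".join(parts)
```

For example, calling the helper with an identifier, the range ``(9, 15)`` and the made-up origin ``https://example.org/repo.git`` yields the identifier followed by ``;lines=9-15;origin=https://example.org/repo.git``.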
The following pieces of contextual information are supported:

* line numbers: it is possible to specify a single line number or a line range,
  separating two numbers with "-". Note that line numbers are purely indicative
  and are not meant to be stable, as in some degenerate cases (e.g., text files
  which mix different types of line terminators) it is impossible to resolve
  them unambiguously.

* software origin: where a given object has been found or observed in the wild,
  as the URI that was used by Software Heritage to ingest the object into the
  archive.
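To make the caveat about line terminators concrete, here is a small Python sketch showing how the same line range can select different bytes depending on how lines are split. The helper ``line_range`` and the sample content are hypothetical, and 1-based, inclusive line numbering is an assumption that the text above does not spell out.

```python
from typing import List


def line_range(data: bytes, first: int, last: int, universal: bool) -> List[bytes]:
    """Return lines first..last (assumed 1-based, inclusive) of a content.

    With universal=True, "\r", "\n" and "\r\n" all end a line;
    otherwise only "\n" does.
    """
    lines = data.splitlines() if universal else data.split(b"\n")
    return lines[first - 1:last]


# Made-up content mixing three kinds of line terminators.
mixed = b"alpha\r\nbeta\rgamma\ndelta\n"

# The same "lines=2" context selects different bytes under each convention.
lf_only = line_range(mixed, 2, 2, universal=False)        # [b"beta\rgamma"]
any_terminator = line_range(mixed, 2, 2, universal=True)  # [b"beta"]
```

Neither result is wrong in itself, which is why line numbers are best treated as a hint for human readers and not as a stable reference.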