Commit 5e339b44 authored by David Douard

Add a glossary and begin to use it in the getting-started page

parent 4801d9c1
Merge request !11: Add the beginning of a top-level architecture document
@@ -119,7 +119,7 @@ Step 3 --- set up storage
Then you will need a local storage service that will archive and serve source
code artifacts via a REST API. The Software Heritage storage layer comes in two
parts: a content-addressable object storage on your file system (for file
parts: a content-addressable :term:`object storage` on your file system (for file
contents) and a Postgres database (for the graph structure of the archive). See
the :ref:`data-model` for more information. The storage layer is configured via
a YAML configuration file, located at
@@ -137,13 +137,13 @@ a YAML configuration file, located at
root: /srv/softwareheritage/objects/
slicing: 0:2/2:4
Make sure that the object storage root exists on the filesystem and is writable
Make sure that the :term:`object storage` root exists on the filesystem and is writable
to your user, e.g.::
sudo mkdir -p /srv/softwareheritage/objects
sudo chown "${USER}:" /srv/softwareheritage/objects
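To make the ``slicing: 0:2/2:4`` setting more concrete, here is a minimal sketch of how such a spec could map an object identifier to a path under the storage root. It assumes that each ``start:stop`` pair selects characters of the hexadecimal identifier as one directory level and that the file itself is named after the full identifier. The function name ``sliced_path`` is hypothetical and is not part of the Software Heritage API.

```python
def sliced_path(root: str, hex_id: str, slicing: str = "0:2/2:4") -> str:
    """Return the on-disk path of an object, given its hex identifier.

    Assumption: every "start:stop" pair in the slicing spec yields one
    directory level, taken from that character range of the identifier.
    """
    levels = []
    for bounds in slicing.split("/"):
        start, _, stop = bounds.partition(":")
        levels.append(hex_id[int(start):int(stop)])
    # The object is stored under its full identifier inside the last level.
    return "/".join([root.rstrip("/"), *levels, hex_id])


if __name__ == "__main__":
    print(sliced_path("/srv/softwareheritage/objects/",
                      "34973274ccef6ab4dfaaf86599792fa9c3fe4689"))
```

Under these assumptions, two directory levels of two hex characters each spread the objects over at most 256 x 256 directories, which keeps any single directory from growing too large.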
You are done with object storage setup! Let's setup the database::
You are done with :term:`object storage` setup! Let's setup the database::
swh-db-init storage -d softwareheritage-dev
......
:orphan:
.. _glossary:
Glossary
========
.. glossary::
archive
An instance of the |swh| data store.
archiver
A component dedicated to replicating an :term:`archive` and ensuring that
there are enough copies of each element to guarantee resiliency.
ark
`Archival Resource Key`_ (ARK) is a Uniform Resource Locator (URL) that is
a multi-purpose persistent identifier for information objects of any type.
artifact
software artifact
An artifact is one of many kinds of tangible by-products produced during
the development of software.
content
blob
A (specific version of a) file stored in the archive, identified by its
cryptographic hashes (SHA1, "git-like" SHA1, SHA256) and its size. Also
known as: :term:`blob`. Note: it is incorrect to refer to Contents as
"files", because files are usually considered to be named, whereas
Contents are nameless. It is only in the context of specific
:term:`directories <directory>` that :term:`contents <content>` acquire
(local) names.
directory
A set of named pointers to contents (file entries), directories (directory
entries) and revisions (revision entries). All entries are associated with
the local name of the entry (i.e., a relative path without any path
separator) and permission metadata (e.g., ``chmod`` value or equivalent).
doi
A Digital Object Identifier or DOI_ is a persistent identifier or handle
used to uniquely identify objects, standardized by the International
Organization for Standardization (ISO).
journal
The :ref:`journal <swh-journal>` is the persistent logger of the |swh|
architecture, in charge of logging changes to the archive, with
publish-subscribe_ support.
lister
A :ref:`lister <swh-lister>` is a component of the |swh| architecture that
is in charge of enumerating the :term:`software origins <software origin>`
(e.g., VCS repositories, packages, etc.) available at a source code
distribution place.
loader
A :ref:`loader <swh-loader-core>` is a component of the |swh| architecture
responsible for reading a source code :term:`origin` (typically a git
repository) and importing or updating its content in the :term:`archive`
(i.e., adding new file contents to the :term:`object storage` and the
repository structure to the :term:`storage database`).
hash
cryptographic hash
checksum
digest
A fixed-size "summary" of a stream of bytes that is easy to compute, and
hard to reverse (see the Wikipedia article on cryptographic hash
functions). Also known as: :term:`checksum`, :term:`digest`.
indexer
A component of the |swh| architecture dedicated to producing metadata
linked to the known :term:`blobs <blob>` in the :term:`archive`.
objstore
objstorage
object store
object storage
Content-addressable object storage. It is the place where the actual
:term:`blobs <blob>` are stored.
origin
software origin
data source
A location from which a coherent set of sources has been obtained, like a
git repository, a directory containing tarballs, etc.
person
An entity referenced by a revision as either the author or the committer
of the corresponding change. A person is associated with a full name and/or
an email address.
release
tag
milestone
A revision that has been marked as noteworthy with a specific name (e.g.,
a version number), together with associated development metadata (e.g.,
author, timestamp, etc.).
revision
commit
changeset
A point in time snapshot of the content of a directory, together with
associated development metadata (e.g., author, timestamp, log message,
etc).
scheduler
The component of the |swh| architecture dedicated to the management and
the prioritization of the many tasks.
snapshot
The state of all visible branches during a specific visit of an origin.
storage
storage database
The main database of the |swh| platform, in which all the elements of
the :ref:`data-model` except the :term:`content` are stored as a :ref:`Merkle
DAG <swh-merkle-dag>`.
type of origin
Information about the kind of hosting, e.g., whether it is a forge, a
collection of repositories, a homepage publishing tarballs, or a one-shot
source code repository. For all kinds of repositories, please specify which
VCS is in use (Git, SVN, CVS, etc.).
vault
vault service
User-facing service that allows users to retrieve parts of the
:term:`archive` as self-contained bundles (e.g., individual releases, entire
repository snapshots, etc.).
visit
The passage of |swh| on a given :term:`origin`, to retrieve all source
code and metadata available there at the time. A visit object stores the
state of all visible branches (if any) available at the origin at visit
time; each of them points to a revision object in the archive. Future
visits of the same origin will create new visit objects, without removing
previous ones.
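As an illustration of the :term:`content` entry above, the following sketch computes the identifiers it mentions (SHA1, "git-like" SHA1, SHA256, and the size) for a byte string. It assumes that the "git-like" SHA1 is the one git computes for a blob object, i.e. the SHA1 of a ``blob <size>\0`` header followed by the data. The function name and the dictionary keys are illustrative, not taken from the |swh| code base.

```python
import hashlib


def content_identifiers(data: bytes) -> dict:
    """Compute the hashes and size that identify a nameless content."""
    # Assumption: "git-like" SHA1 means git's blob object hash, which
    # prefixes the data with the header b"blob <size>\0".
    git_header = b"blob %d\0" % len(data)
    return {
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha1_git": hashlib.sha1(git_header + data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
        "length": len(data),
    }


if __name__ == "__main__":
    for key, value in content_identifiers(b"hello\n").items():
        print(key, value)
```

Note that no file name enters the computation: two files with the same bytes in different directories yield the same identifiers, which is why contents only acquire names within a directory.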
.. _blob: https://en.wikipedia.org/wiki/Binary_large_object
.. _DOI: https://www.doi.org
.. _`persistent identifier`: https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html#persistent-identifiers
.. _`Archival Resource Key`: http://n2t.net/e/ark_ids.html
.. _publish-subscribe: https://en.wikipedia.org/wiki/Publish%E2%80%93subscribe_pattern
@@ -116,6 +116,7 @@ Indices and tables
* :ref:`modindex`
* `URLs index <http-routingtable.html>`_
* :ref:`search`
* :ref:`glossary`
.. ensure sphinx does not complain about index files not being included
......