Commit 0012765f authored by vlorentz

docs: Add workflow for extrinsic metadata + mention storage on the path from loader to journal

parent 0bb9928c
Merge request: !487 (docs: Add workflow for extrinsic metadata + mention storage on the path from loader to journal)
Pipeline #2009 passed
@startuml
participant LOADERS as "Metadata Loaders"
participant STORAGE as "Graph Storage"
participant JOURNAL as "Journal"
participant IDX_REM_META as "REM Indexer"
participant IDX_STORAGE as "Indexer Storage"
activate IDX_STORAGE
activate STORAGE
activate JOURNAL
activate LOADERS
LOADERS->>STORAGE: new REM (Raw Extrinsic Metadata) object\n for Origin http://example.org/repo.git\nor object swh:1:dir:...
STORAGE->>JOURNAL: new REM object
deactivate LOADERS
JOURNAL->>IDX_REM_META: run indexers on REM object
activate IDX_REM_META
IDX_REM_META->>IDX_REM_META: recognize REM object (gitea/github/deposit/...)
IDX_REM_META->>IDX_REM_META: parse REM object
alt If the REM object describes an origin
IDX_REM_META->>IDX_STORAGE: origin_extrinsic_metadata_add(id="http://example.org/repo.git", {author: "Jane Doe", ...})
IDX_STORAGE->>IDX_REM_META: ok
end
alt If the REM object describes a directory
IDX_REM_META->>IDX_STORAGE: directory_extrinsic_metadata_add(id="swh:1:dir:...", {author: "Jane Doe", ...})
IDX_STORAGE->>IDX_REM_META: ok
end
deactivate IDX_REM_META
@enduml
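The dispatch step in the sequence diagram above can be illustrated with a minimal Python sketch. It is not the actual indexer code: `RawExtrinsicMetadata`, `FakeIndexerStorage`, `KNOWN_FORMATS` and `index_rem` are hypothetical names introduced for illustration, and the parsing step is reduced to a plain copy. Only the two storage method names and the example identifiers are taken from the diagram.

```python
from dataclasses import dataclass, field


@dataclass
class RawExtrinsicMetadata:
    """Hypothetical stand-in for a REM object (not the real model class)."""
    target: str    # origin URL, or a SWHID such as "swh:1:dir:..."
    format: str    # e.g. "gitea", "github", "deposit"
    metadata: dict


@dataclass
class FakeIndexerStorage:
    """In-memory stand-in for the indexer storage, for illustration only."""
    origins: dict = field(default_factory=dict)
    directories: dict = field(default_factory=dict)

    def origin_extrinsic_metadata_add(self, id, metadata):
        self.origins[id] = metadata
        return "ok"

    def directory_extrinsic_metadata_add(self, id, metadata):
        self.directories[id] = metadata
        return "ok"


# Assumption: the formats named in the diagram are the recognized ones.
KNOWN_FORMATS = {"gitea", "github", "deposit"}


def index_rem(rem, storage):
    """Recognize, parse, then store a REM object according to its target."""
    if rem.format not in KNOWN_FORMATS:
        return None  # unrecognized REM object: nothing is indexed
    parsed = dict(rem.metadata)  # placeholder for the real parsing step
    if rem.target.startswith("swh:1:dir:"):
        return storage.directory_extrinsic_metadata_add(rem.target, parsed)
    if rem.target.startswith("swh:1:"):
        return None  # other object types are not covered by the diagram
    return storage.origin_extrinsic_metadata_add(rem.target, parsed)
```

Calling `index_rem` with a target of `http://example.org/repo.git` takes the origin branch, while a target starting with `swh:1:dir:` takes the directory branch, mirroring the two `alt` blocks.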
@startuml
participant LOADERS as "Loaders"
participant STORAGE as "Graph Storage"
participant JOURNAL as "Journal"
participant IDX_ORIG_META as "Origin Metadata Indexer"
participant IDX_ORIG_HEAD as "Origin-Head Indexer"
participant IDX_DIR_META as "Directory Metadata Indexer"
participant IDX_CONT_META as "Content Metadata Indexer"
participant IDX_STORAGE as "Indexer Storage"
participant STORAGE as "Graph Storage"
participant OBJ_STORAGE as "Object Storage"
activate OBJ_STORAGE
@@ -17,7 +17,9 @@
activate LOADERS
-LOADERS->>JOURNAL: Origin http://example.org/repo.git\nwas added/revisited
+LOADERS->>STORAGE: Repository content
+LOADERS->>STORAGE: Origin http://example.org/repo.git\nwas added/revisited
+STORAGE->>JOURNAL: Origin http://example.org/repo.git\nwas added/revisited
deactivate LOADERS
JOURNAL->>IDX_ORIG_META: run indexers on origin\nhttp://example.org/repo.git
......
@@ -14,7 +14,7 @@ at each step in the indexer storage.
Indexer architecture
^^^^^^^^^^^^^^^^^^^^
-.. thumbnail:: images/tasks-metadata-indexers.svg
+.. thumbnail:: images/tasks-intrinsic-metadata-indexers.svg
Origin-Head Indexer
@@ -96,6 +96,11 @@ This normalization makes up for most of the code of
Extrinsic metadata
------------------
Indexer architecture
^^^^^^^^^^^^^^^^^^^^
.. thumbnail:: images/tasks-extrinsic-metadata-indexers.svg
The :term:`extrinsic metadata` indexer works very differently from
the :term:`intrinsic metadata` indexers we saw above.
While the latter extract metadata from software artefacts (files and directories)
......