swh-graph in production
We need an swh-graph service running in production, including fully automated periodic exports from storage. This meta-task tracks all the related activities needed to achieve this goal.
Migrated from T2220 (view on Phabricator)
Designs
- swh-vault #887
- swh/meta #2217
- swh/meta #3550
Activity
- David Douard mentioned in merge request !182 (closed)
- David Douard mentioned in issue swh-vault#887
- Phabricator Migration user marked this issue as related to swh-vault#887
- Phabricator Migration user marked this issue as related to swh/meta#2204
- Phabricator Migration user marked this issue as related to swh/meta#2217
- added Compressed graph service, Roadmap 2021, Roadmap 2022, and meta-task labels
- vlorentz added priority:Normal label
- Phabricator Migration user marked this issue as related to swh-vault#3096 (closed)
- Roberto Di Cosmo changed the description
- Phabricator Migration user marked this issue as related to #3161 (closed)
- Phabricator Migration user marked this issue as related to swh/meta#3550
- Benoit Chauvet added priority:High label and removed priority:Normal label
Graph status meeting
- Graph querying
  - Rocquencourt: Granet, 700 GB of RAM (maximum reached)
- Graph compression: 1.7 TB minimum
  - Telecom: 4 TB machine
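The sizing figures in these notes can be compared with a small check. This is only a sketch: the host keys `granet` and `telecom` are illustrative labels, the numbers are the ones quoted in the meeting notes, and reading them as decimal units (1 TB = 1000 GB) is an assumption.

```python
# Hypothetical sizing check based on the figures quoted in the meeting notes.
# Assumption: decimal units, so 4 TB = 4000 GB and 1.7 TB = 1700 GB.
HOSTS_RAM_GB = {"granet": 700, "telecom": 4000}
COMPRESSION_MIN_RAM_GB = 1700


def hosts_able_to_compress(hosts, required_gb):
    """Return the names of hosts whose RAM meets the minimum, sorted by name."""
    return sorted(name for name, ram in hosts.items() if ram >= required_gb)


print(hosts_able_to_compress(HOSTS_RAM_GB, COMPRESSION_MIN_RAM_GB))
```

Under these assumptions, only the Telecom machine has enough memory for compression.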
Compression
- Dataset: https://docs.softwareheritage.org/devel/swh-dataset/graph/dataset.html?highlight=dataset
- Compression pipeline: Python
- Implementation of the steps: org.softwareheritage.graph.compress
TODO
- Finish gRPC migration (seirl)
  - Forge issues to clean up once gRPC is merged (seirl)
- Automate deployment (sysadm) >> prepare the command
- Native Hadoop libraries (?)
  - #4250 (closed)
  - Benchmark performance with and without the Hadoop libraries
- Luigi ETL[1] pipeline for compression / deployment
- Integrate the generated Javadoc into the swh docs (vlorentz)
- Integrate Java code coverage into the forge
- Unit test the compression pipeline
- Phabricator Migration user mentioned in issue swh/meta#3550