Deploy swh-indexer v3.0.0
Breaking changes:
- pyproject.toml: Remove swh.workers entrypoint
- Remove task-based indexing
New features:
- github & gitea: Index fork relationships as forge:forkedFrom
- github & gitea: Map schema:programmingLanguage
For the richer metadata to be available in swh-search, the extrinsic metadata needs to be reindexed: once the indexer is updated, reset the swh-indexer-prod-01-swh.indexer.journal_client.extrinsic_metadata consumer (and its equivalent on staging) so that it consumes the swh.journal.objects.raw_extrinsic_metadata topic from the beginning.
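One possible way to perform the consumer reset described above is with Kafka's stock `kafka-consumer-groups.sh` tool. The sketch below only builds the command line and defaults to a dry run. The broker address `kafka.example.org:9092` and the helper name `reset_offsets_command` are hypothetical, and this issue does not say which tool is actually used for the reset. Kafka only accepts an offset reset while the consumer group is inactive, so the journal clients have to be stopped first.

```python
import shlex

# Names taken from this issue.
GROUP = "swh-indexer-prod-01-swh.indexer.journal_client.extrinsic_metadata"
TOPIC = "swh.journal.objects.raw_extrinsic_metadata"


def reset_offsets_command(bootstrap_server, group=GROUP, topic=TOPIC, execute=False):
    """Build the kafka-consumer-groups.sh argv that rewinds `group` to the
    earliest available offsets of `topic`.

    With execute=False the tool only reports the offsets it would set
    (--dry-run); with execute=True it applies them (--execute).
    """
    return [
        "kafka-consumer-groups.sh",
        "--bootstrap-server", bootstrap_server,
        "--group", group,
        "--topic", topic,
        "--reset-offsets",
        "--to-earliest",
        "--execute" if execute else "--dry-run",
    ]


if __name__ == "__main__":
    # Hypothetical broker address; replace with a real bootstrap server.
    print(shlex.join(reset_offsets_command("kafka.example.org:9092")))
```

Running the printed command with `--dry-run` first shows the offsets that would be applied; passing `execute=True` produces the variant that actually rewinds the group.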
Production rollout plan:
- Stop the extrinsic_metadata journal clients on the static VMs (puppet / azure)
- Deploy a read-write indexer storage RPC service in the prod k8s cluster (swh/infra/ci-cd/swh-charts!294 (merged))
- Deploy the indexer journal clients there (for extrinsic metadata only) with a new consumer group
- Check that new indexed origin extrinsic metadata objects are being produced
- Decommission the azure journal clients (done through [1])
[1] #5238 (closed)
Edited by Antoine R. Dumont