Deploy swh-indexer v3.0.0

Breaking changes:

  • pyproject.toml: Remove swh.workers entrypoint
  • Remove task-based indexing

New features:

  • github & gitea: Index fork relationships as forge:forkedFrom
  • github & gitea: Map schema:programmingLanguage

For the richer metadata to be available in swh-search, the extrinsic metadata needs to be reindexed. To do so, once the indexer has been updated, reset the swh-indexer-prod-01-swh.indexer.journal_client.extrinsic_metadata consumer (and its equivalent on staging) so that it consumes the swh.journal.objects.raw_extrinsic_metadata topic from the beginning.
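As an illustration of the reset described above, here is a minimal sketch that builds the command line for Kafka's stock `kafka-consumer-groups.sh` tool. It assumes the consumer name given above is the Kafka consumer group id and that the tool is on the `PATH`. The broker address `kafka.example.org:9092` and the helper `build_reset_command` are hypothetical, and this is not necessarily how the reset is performed on the SWH infrastructure. Kafka only accepts an offset reset while the group has no active members, so the journal clients must be stopped first.

```python
import shlex
from typing import List

# Names taken from the text above; assumed to be the Kafka group id and topic.
GROUP = "swh-indexer-prod-01-swh.indexer.journal_client.extrinsic_metadata"
TOPIC = "swh.journal.objects.raw_extrinsic_metadata"


def build_reset_command(
    bootstrap_server: str, group: str, topic: str, execute: bool = False
) -> List[str]:
    """Build the argv that rewinds `group` to the earliest offset of `topic`.

    Hypothetical helper: it only assembles the command and does not run it.
    With execute=False the tool is asked for a dry run, which prints the
    planned offsets without committing them.
    """
    return [
        "kafka-consumer-groups.sh",
        "--bootstrap-server", bootstrap_server,
        "--group", group,
        "--topic", topic,
        "--reset-offsets",
        "--to-earliest",
        "--execute" if execute else "--dry-run",
    ]


if __name__ == "__main__":
    # Placeholder broker address; replace with a real bootstrap server.
    command = build_reset_command("kafka.example.org:9092", GROUP, TOPIC)
    print(shlex.join(command))
```

Running the dry run first and only then switching to `execute=True` makes it possible to review the planned offsets before they are committed.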

Production rollout plan:

  • Stop the extrinsic_metadata journal clients in the static VMs (puppet / azure)
  • Deploy a read-write indexer storage rpc service in the prod k8s cluster (swh/infra/ci-cd/swh-charts!294 (merged))
  • Deploy the indexer journal clients there (for extrinsic metadata only) with a new consumer group
  • Check that new origin extrinsic metadata objects are being indexed
  • Decommission the azure journal clients (done through [1])

[1] #5238 (closed)

Edited by Antoine R. Dumont