Make next-version environment independent from staging

Context: Next-version is a temporary deployment spawned when a new swh version is ready to be deployed [1]. It currently uses the staging cluster and its services as a testbed. This can have side effects on the staging deployments that are tricky to analyze and untangle.

To avoid some of those issues, we want to separate next-version's services/backends from staging's.

To that end, we'll need to adapt the current templates so that most services can start from scratch (with backend management [2]). We'll use minikube or kind to test those changes, and document the setup.

[1] When any swh module is tagged, a new swh version is created.

[2] This is not currently the case: all of our deployed backends are years old, so the charts have not had to deal with backend management so far.

Plan:

#5311 (closed): Allow postgresql database to run in kubernetes
Allow rabbitmq to run in kubernetes
swh/infra/ci-cd/swh-charts!423 (merged): Add scheduler services (so they use both rabbitmq and scheduler db ^)
swh/infra/ci-cd/swh-charts!425 (merged): Initialize storage archive db (data model)
swh/infra/ci-cd/swh-charts!426 (merged): Migrate worker workload to the new storage/scheduler
- Add storage service to use that db ^ (objstorage configured either as a noop or as a memory objstorage)
- Adapt loaders so they consume from the namespaced rabbitmq (and no longer the staging rabbitmq)
- Add lister so listing can happen (filling up scheduler db)
Checks
Fix issues (loaders and lister were misconfigured and still plugged into the previous rabbitmq; fixed)
swh/infra/ci-cd/swh-charts!428 (merged): scrubber: Add backend model management (like the other services)
swh-charts: Add initialize/migrate config flags for most pg backends (scheduler, vault, storage pg, ...)
vault
storage
scheduler
web (already ok)
deposit (already ok)
svix/webhooks: Allow webapp to manage save-code-now events through webhooks
- Dedicate service (already present)
- Initialize backends (already present but done slightly differently [2])
- swh/infra/ci-cd/swh-charts!431 (merged): Allow webhooks initialization through configuration
kafka: Make storage writing backend write to kafka topics
- Determine what operators to use [3]
- strimzi is affiliated with the CNCF and is community driven (it also has a Helm chart)
- swh/infra/ci-cd/swh-charts!432 (merged): Allow kafka to run in kubernetes
- swh/infra/ci-cd/swh-charts!432 (merged): Adapt storage backend to write in kafka instance
- swh/infra/ci-cd/swh-charts!432 (merged): Make journal clients read kafka instance
cassandra: Switch to main writing storage backend and use another storage postgresql as reader
- swh/infra/ci-cd/swh-charts!436 (merged): Adapt storage to use cassandra as writing storage
- swh/infra/ci-cd/swh-charts!436 (merged): Switch reader to use ro-storage-archive-postgresql
- Make replayer fill in the ro-storage-archive-postgresql
swh/devel/swh-docs!434 (merged): Document how to start the local cluster with the "full" swh stack (ready and running [4])
swh/infra/ci-cd/swh-charts!438 (merged): search: Allow swh.search to use its own elasticsearch backend
- Allow elasticsearch to run in kubernetes (operator)
- Create dedicated elasticsearch cluster
- Adapt swh-search service to use the elasticsearch cluster
- Adapt webapp (and the services depending on it) to use swh-search for search
swh/infra/ci-cd/swh-charts!438 (merged): redis: Add swh.counter service to use its own redis
- Allow redis to run in kubernetes (operator)
- Create dedicated redis cluster
- Adapt swh-counters service to use the redis cluster
- Adapt webapp to use swh-counter for counters
swh/infra/ci-cd/swh-charts!442 (merged): Install a dedicated objstorage
swh/infra/ci-cd/swh-charts!442 (merged): swh.web: Make it browsable from local machine
Enable graphql
swh.alter: Enable dedicated toolbox (with a fake graph reference for now)
Adapt next-version environment with the remaining independent cogs
- ~~svix/webhooks~~ (not possible: it does not support multiple deployments)
- cassandra backend
- kafka backend
- objstorage backend
- Add storage instance so the write workload happens in cassandra & the objstorage (which writes to kafka)
- Activate storage-replayer so they replay from cassandra backend to storage postgresql db (another storage instance)
- Add elasticsearch backend
- Activate swh-search rpc
- Activate indexer-storage
- Activate indexers
- Activate search journal client
- Add redis backend
- Activate swh-counters rpc
- Update webapp to use swh-search and swh-counters rpc
- Checks and fixes
  - Use read-only access for storage db access when needed
  - Indexer journal: Fix prefix key
swh/devel/swh-docs!436 (merged): Document how to reset next-version environment
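
The "Checks" items above mostly amount to verifying that no next-version service is still plugged into a staging backend (as happened with the loaders and lister). A minimal sketch of such a check, assuming the rendered service configuration is available as a nested dict; the function name, the configuration keys and the hostnames below are hypothetical and do not come from swh-charts:

```python
from __future__ import annotations

from typing import Any, Iterator


def find_forbidden_refs(
    config: Any, forbidden: tuple[str, ...], path: str = ""
) -> Iterator[tuple[str, str]]:
    """Yield (path, value) for each string value containing a forbidden marker.

    Walks nested dicts and lists; non-string leaves are ignored.
    """
    if isinstance(config, dict):
        for key, value in config.items():
            child = f"{path}.{key}" if path else str(key)
            yield from find_forbidden_refs(value, forbidden, child)
    elif isinstance(config, list):
        for index, value in enumerate(config):
            yield from find_forbidden_refs(value, forbidden, f"{path}[{index}]")
    elif isinstance(config, str):
        if any(marker in config for marker in forbidden):
            yield path, config


# Hypothetical configuration: the loader still points at a staging broker.
example_config = {
    "loader": {"celery": {"broker_url": "amqp://rabbitmq.staging.example:5672/"}},
    "lister": {"celery": {"broker_url": "amqp://rabbitmq.next-version.example:5672/"}},
    "storage": {"hosts": ["cassandra.next-version.example"]},
}

leaks = list(find_forbidden_refs(example_config, ("staging",)))
for location, value in leaks:
    print(f"{location} still references staging: {value}")
```

With this example input, only the loader's broker URL is reported. The substring match is deliberately crude; a real check would likely compare against the actual staging hostnames.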

[1] They should be off by default, on for the swh-next-version environment

[2] The postgresql backend is not initialized through the postgresql operator.

[3] https://portworx.com/blog/choosing-the-right-kubernetes-operator-for-apache-kafka/

[4] There might be race conditions though, as there is no orchestration within our charts, so small breakages can still occur, the same way they can in current deployments. Also, it is only mostly "full": some cogs can still be missing (e.g. graph, ...)

Edited Aug 08, 2024 by Antoine R. Dumont
