swh/storage: Deploy raw_extrinsic_metadata content backfiller

Guillaume Samson requested to merge storage_backfiller into production

Related to swh/infra/sysadm-environment#5216 (closed)

These changes deploy a backfiller job for raw_extrinsic_metadata objects targeting contents (cnt), using a single range that covers the whole identifier space.
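
The single range corresponds to the --start-object / --end-object bounds passed to the job in the diff. As a rough illustration only, the sketch below shows how an identifier space could be cut into N contiguous ranges. The helper name `split_swhid_range` and its parameters are hypothetical, this is not how the chart computes its ranges, and treating the end bound as inclusive is an assumption.

```python
from typing import List, Tuple


def split_swhid_range(
    object_type: str, nb_ranges: int, hex_len: int = 40
) -> List[Tuple[str, str]]:
    """Split the hex hash space into nb_ranges contiguous (start, end) pairs.

    Hypothetical helper for illustration; bounds are assumed inclusive.
    """
    if nb_ranges < 1:
        raise ValueError("nb_ranges must be >= 1")
    space = 16 ** hex_len  # number of possible hashes of hex_len hex digits
    ranges = []
    for i in range(nb_ranges):
        start = i * space // nb_ranges
        end = (i + 1) * space // nb_ranges - 1
        ranges.append(
            (
                f"swh:1:{object_type}:{start:0{hex_len}x}",
                f"swh:1:{object_type}:{end:0{hex_len}x}",
            )
        )
    return ranges


if __name__ == "__main__":
    # With one range, the bounds span the full space, as in this job.
    for start, end in split_swhid_range("cnt", 1):
        print(start, end)
```

With `nb_ranges=1` the sketch yields one pair running from the all-zero identifier to the all-`f` identifier, which is the shape of the arguments of the job named storage-backfiller-raw-extrinsic-metadata-content-0.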

Helm diff
./swh/helm-diff.sh
[swh] Comparing changes between branches production and storage_backfiller (per environment)...
Your branch is up to date with 'origin/production'.
[swh] Generate config in production branch for environment staging, namespace swh...
[swh] Generate config in production branch for environment staging, namespace swh-cassandra...
[swh] Generate config in production branch for environment staging, namespace swh-cassandra-next-version...
Your branch is up to date with 'origin/storage_backfiller'.
[swh] Generate config in storage_backfiller branch for environment staging...
[swh] Generate config in storage_backfiller branch for environment staging...
[swh] Generate config in storage_backfiller branch for environment staging...
Your branch is up to date with 'origin/production'.
[swh] Generate config in production branch for environment production, namespace swh...
[swh] Generate config in production branch for environment production, namespace swh-cassandra...
[swh] Generate config in production branch for environment production, namespace swh-cassandra-next-version...
Your branch is up to date with 'origin/storage_backfiller'.
[swh] Generate config in storage_backfiller branch for environment production...
[swh] Generate config in storage_backfiller branch for environment production...
[swh] Generate config in storage_backfiller branch for environment production...


------------- diff for environment staging namespace swh -------------

     _        __  __
   _| |_   _ / _|/ _|  between /tmp/swh-chart.swh.6fcRmRtQ/staging-swh.before, 111 documents
 / _' | | | | |_| |_       and /tmp/swh-chart.swh.6fcRmRtQ/staging-swh.after, 111 documents
| (_| | |_| |  _|  _|
 \__,_|\__, |_| |_|   returned no differences
        |___/



------------- diff for environment staging namespace swh-cassandra -------------

     _        __  __
   _| |_   _ / _|/ _|  between /tmp/swh-chart.swh.6fcRmRtQ/staging-swh-cassandra.before, 377 documents
 / _' | | | | |_| |_       and /tmp/swh-chart.swh.6fcRmRtQ/staging-swh-cassandra.after, 377 documents
| (_| | |_| |  _|  _|
 \__,_|\__, |_| |_|   returned no differences
        |___/



------------- diff for environment staging namespace swh-cassandra-next-version -------------

     _        __  __
   _| |_   _ / _|/ _|  between /tmp/swh-chart.swh.6fcRmRtQ/staging-swh-cassandra-next-version.before, 146 documents
 / _' | | | | |_| |_       and /tmp/swh-chart.swh.6fcRmRtQ/staging-swh-cassandra-next-version.after, 146 documents
| (_| | |_| |  _|  _|
 \__,_|\__, |_| |_|   returned no differences
        |___/



------------- diff for environment production namespace swh -------------

     _        __  __
   _| |_   _ / _|/ _|  between /tmp/swh-chart.swh.6fcRmRtQ/production-swh.before, 410 documents
 / _' | | | | |_| |_       and /tmp/swh-chart.swh.6fcRmRtQ/production-swh.after, 412 documents
| (_| | |_| |  _|  _|
 \__,_|\__, |_| |_|   returned one difference
        |___/

(file level)
    ---
    # Source: swh/templates/storage/backfiller-configmap.yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: storage-backfiller-raw-extrinsic-metadata-content-configuration-template
      namespace: swh
    data:
      config.yml.template: |
        storage:
          cls: postgresql
          db: host=postgresql-storage-rw.internal.softwareheritage.org port=5432 user=guest
            dbname=softwareheritage password=${POSTGRESQL_PASSWORD}
        journal_writer:
          anonymize: true
          brokers:
          - kafka1.internal.softwareheritage.org
          - kafka2.internal.softwareheritage.org
          - kafka3.internal.softwareheritage.org
          - kafka4.internal.softwareheritage.org
          client_id: swh.storage.journal_writer.${HOSTNAME}
          cls: kafka
          prefix: swh.journal.objects
          producer_config:
            message.max.bytes: 1000000000
        
    # Source: swh/templates/storage/backfiller-jobs.yaml
    apiVersion: batch/v1
    kind: Job
    metadata:
      namespace: swh
      name: storage-backfiller-raw-extrinsic-metadata-content-0
      labels:
        app: storage-backfiller-raw-extrinsic-metadata-content-0
    spec:
      backoffLimit: 0
      template:
        metadata:
          labels:
            app: storage-backfiller-raw-extrinsic-metadata-content
          annotations:
            checksum/config: c1fe7604b1f039736e4e2edb16728a614aac1b50c9232f05e8d587d6d1c12241
            checksum/database-utils: 1cc40c53ffcd5d3a71357e55336d394b97a8e3c6fe8cb7aedc0b595cea7f92b7
            checksum/config-utils: d75ca13b805bce6a8ab59c8e24c938f2283108f6a79134f6e71db86308651dc6
        spec:
          restartPolicy: Never
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: swh/backfiller
                    operator: In
                    values:
                    - "true"
          priorityClassName: swh-background-workload
          initContainers:
          - name: prepare-configuration
            image: "container-registry.softwareheritage.org/swh/infra/swh-apps/utils:20231211.1"
            imagePullPolicy: IfNotPresent
            command:
            - /entrypoints/prepare-configuration.sh
            env:
            - name: POSTGRESQL_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: postgres-guest-password
                  name: swh-storage-postgresql-common-secret
                  optional: false
            volumeMounts:
            - name: configuration
              mountPath: /etc/swh
            - name: configuration-template
              mountPath: /etc/swh/configuration-template
            - name: config-utils
              mountPath: /entrypoints
              readOnly: true
          - name: check-migration
            image: "container-registry.softwareheritage.org/swh/infra/swh-apps/storage:20240603.1"
            command:
            - /entrypoints/check-storage-db-version.sh
            env:
            - name: MODULE
              value: storage
            volumeMounts:
            - name: configuration
              mountPath: /etc/swh
            - name: database-utils
              mountPath: /entrypoints
          containers:
          - name: raw-extrinsic-metadata-content
            image: "container-registry.softwareheritage.org/swh/infra/swh-apps/storage:20240603.1"
            imagePullPolicy: IfNotPresent
            resources:
              requests:
                memory: 1200Mi
                cpu: 450m
            command:
            - /opt/swh/entrypoint.sh
            args:
            - swh
            - storage
            - "-C"
            - /etc/swh/config.yml
            - backfill
            - "--start-object"
            - "swh:1:cnt:0000000000000000000000000000000000000000"
            - "--end-object"
            - "swh:1:cnt:ffffffffffffffffffffffffffffffffffffffff"
            - raw_extrinsic_metadata
            env:
            - name: STATSD_HOST
              value: prometheus-statsd-exporter
            - name: STATSD_PORT
              value: 9125
            - name: STATSD_TAGS
              value: "deployment:storage-backfiller-raw-extrinsic-metadata-content"
            - name: SWH_LOG_LEVEL
              value: INFO
            - name: SWH_SENTRY_ENVIRONMENT
              value: production
            - name: SWH_MAIN_PACKAGE
              value: swh.storage
            - name: SWH_SENTRY_DSN
              valueFrom:
                secretKeyRef:
                  name: common-secrets
                  key: storage-sentry-dsn
                  # 'name' secret should exist & include key
                  # if the setting doesn't exist, sentry pushes will be disabled
                  optional: true
            - name: SWH_SENTRY_DISABLE_LOGGING_EVENTS
              value: "true"
            volumeMounts:
            - name: configuration
              mountPath: /etc/swh
          volumes:
          - name: configuration
            emptyDir: {}
          - name: configuration-template
            configMap:
              name: storage-backfiller-raw-extrinsic-metadata-content-configuration-template
              items:
              - key: config.yml.template
                path: config.yml.template
          - name: database-utils
            configMap:
              name: database-utils
              defaultMode: 0555
          - name: config-utils
            configMap:
              name: config-utils
              defaultMode: 0555
    
  



------------- diff for environment production namespace swh-cassandra -------------

     _        __  __
   _| |_   _ / _|/ _|  between /tmp/swh-chart.swh.6fcRmRtQ/production-swh-cassandra.before, 94 documents
 / _' | | | | |_| |_       and /tmp/swh-chart.swh.6fcRmRtQ/production-swh-cassandra.after, 94 documents
| (_| | |_| |  _|  _|
 \__,_|\__, |_| |_|   returned no differences
        |___/
