swh/staging: Deploy bulk on-demand archival feature

Guillaume Samson requested to merge save_bulk_staging into production

Related to swh/infra/sysadm-environment#5407 (closed)

These modifications deploy the bulk on-demand archival feature:

  1. add swh.web.save_bulk to the Django apps;
  2. create a dedicated lister;
  3. create a dedicated loader;
  4. update the scheduler extra-services template;
  5. create a runner-first-visits scheduler.
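
The lister and loader ConfigMaps in the diff each carry an init-container-entrypoint.sh that expands ${AMQP_PASSWORD} in config.yml.template and then swaps placeholder lines such as __lister-credentials__ for the indented content of a mounted secret file. A minimal Python sketch of that rendering logic follows, for reviewers who want to reason about the resulting config.yml. The function names, the broker host and the credential content are hypothetical, and the sketch only handles ${NAME} references, whereas the shell `eval echo` in the real script expands more than that.

```python
import re
from typing import Mapping, Optional


def expand_env(template: str, env: Mapping[str, str]) -> str:
    """Replace ${NAME} references with values from env.

    Unknown names expand to an empty string, as an unset variable would in a shell.
    """
    return re.sub(r"\$\{(\w+)\}", lambda m: env.get(m.group(1), ""), template)


def fill_placeholder(
    config: str, placeholder: str, content: Optional[str], indent: int
) -> str:
    """Mimic the awk/sed step of the entrypoint script.

    With content: the whole line holding the placeholder is replaced by the
    content, each line prefixed with `indent` spaces (the awk + sed branch).
    Without content: only the placeholder token is removed (the sed -i branch).
    """
    out = []
    for line in config.splitlines():
        if placeholder not in line:
            out.append(line)
        elif content is None:
            out.append(line.replace(placeholder, ""))
        else:
            out.extend(" " * indent + part for part in content.splitlines())
    return "\n".join(out) + "\n"


# Hypothetical template and secret content, for illustration only.
template = (
    "celery:\n"
    "  task_broker: amqp://swhconsumer:${AMQP_PASSWORD}@broker.example:5672/%2f\n"
    "credentials:\n"
    "  __lister-credentials__\n"
)
rendered = expand_env(template, {"AMQP_PASSWORD": "s3cret"})
rendered = fill_placeholder(
    rendered, "__lister-credentials__", "gitlab:\n  token: abc\n", indent=2
)
print(rendered)
```

When the secret file is absent, passing `content=None` leaves an empty `credentials:` key, which matches the `else` branch of the script.
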
Helm-diff
[swh] Comparing changes between branches production and save_bulk_staging (per environment)...
Your branch is up to date with 'origin/production'.
[swh] Generate config in production branch for environment staging, namespace swh...
[swh] Generate config in production branch for environment staging, namespace swh-cassandra...
[swh] Generate config in production branch for environment staging, namespace swh-cassandra-next-version...
Your branch is up to date with 'origin/save_bulk_staging'.
[swh] Generate config in save_bulk_staging branch for environment staging...
[swh] Generate config in save_bulk_staging branch for environment staging...
[swh] Generate config in save_bulk_staging branch for environment staging...
Your branch is up to date with 'origin/production'.
[swh] Generate config in production branch for environment production, namespace swh...
[swh] Generate config in production branch for environment production, namespace swh-cassandra...
[swh] Generate config in production branch for environment production, namespace swh-cassandra-next-version...
Your branch is up to date with 'origin/save_bulk_staging'.
[swh] Generate config in save_bulk_staging branch for environment production...
[swh] Generate config in save_bulk_staging branch for environment production...
[swh] Generate config in save_bulk_staging branch for environment production...


------------- diff for environment staging namespace swh -------------

     _        __  __
   _| |_   _ / _|/ _|  between /tmp/swh-chart.swh.g4AQB5iO/staging-swh.before, 141 documents
 / _' | | | | |_| |_       and /tmp/swh-chart.swh.g4AQB5iO/staging-swh.after, 141 documents
| (_| | |_| |  _|  _|
 \__,_|\__, |_| |_|   returned two differences
        |___/

data.config.yml.template  (v1/ConfigMap/swh/web-postgresql-configuration-template)
  ± value change in multiline text (one insert, no deletions)
    + - swh.web.save_bulk


spec.template.metadata.annotations.checksum/config  (apps/v1/Deployment/swh/web-postgresql)
  ± value change
    - c09953473b423227b00456b299ab769ec349d9faea4795a45ed6ecd9aaadb825
    + fbe71ea8a1ec70ec4445537dcb8f2a0325fbd6aec65a0a30f6209090ce4cafc0
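
The checksum/config annotation changes here because the ConfigMap content changed, which is what forces the web-postgresql Deployment to roll out. A minimal sketch of the idea follows, assuming the annotation is a SHA-256 digest of the rendered configuration (the digests above are 64 hex characters, consistent with that, but the exact input the chart hashes is not shown in this diff). The config text and the function name are hypothetical.

```python
import hashlib


def config_checksum(rendered_config: str) -> str:
    """Return the SHA-256 hex digest of a rendered configuration text."""
    return hashlib.sha256(rendered_config.encode("utf-8")).hexdigest()


# Hypothetical config content; only the added line comes from the diff above.
before = "apps:\n- example.app\n"
after = before + "- swh.web.save_bulk\n"

# Any change in the config yields a different digest, so the pod template
# annotation differs and Kubernetes starts a new rollout.
print(config_checksum(before) != config_checksum(after))
```
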



------------- diff for environment staging namespace swh-cassandra -------------

     _        __  __
   _| |_   _ / _|/ _|  between /tmp/swh-chart.swh.g4AQB5iO/staging-swh-cassandra.before, 451 documents
 / _' | | | | |_| |_       and /tmp/swh-chart.swh.g4AQB5iO/staging-swh-cassandra.after, 460 documents
| (_| | |_| |  _|  _|
 \__,_|\__, |_| |_|   returned five differences
        |___/

(file level)
    ---
    # Source: swh/templates/listers/configmap.yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: lister-save-bulk-template
      namespace: swh-cassandra
    data:
      config.yml.template: |
        storage:
          cls: pipeline
          steps:
          - cls: retry
          - cls: remote
            url: http://storage-cassandra-read-only-ingress
        scheduler:
          cls: remote
          url: http://scheduler.internal.staging.swh.network
        celery:
          task_broker: amqp://swhconsumer:${AMQP_PASSWORD}@scheduler0.internal.staging.swh.network:5672/%2f
          task_acks_late: true
          task_queues:
          - swh.lister.save_bulk.tasks.SaveBulkListerTask
        
          sentry_settings_for_celery_tasks:
            __sentry-settings-for-celery-tasks__
        credentials:
          __lister-credentials__
        
      init-container-entrypoint.sh: |
        #!/bin/bash
        
        set -e
        
        CONFIG_FILE=/etc/swh/config.yml
        CONFIG_FILE_WIP=/tmp/wip-config.yml
        
        # substitute environment variables when creating the default config.yml
        eval echo \""$(</etc/swh/configuration-template/config.yml.template)"\" \
          > $CONFIG_FILE
        
        
        SENTRY_SETTINGS_PATH=/etc/credentials/sentry-settings/sentry_settings_for_celery_tasks
        if [ -f $SENTRY_SETTINGS_PATH ]; then
          awk "/__sentry-settings-for-celery-tasks__/{system(\"sed 's/^/    /g' $SENTRY_SETTINGS_PATH\");next}1" $CONFIG_FILE > $CONFIG_FILE_WIP
          mv $CONFIG_FILE_WIP $CONFIG_FILE
        else
          sed -i 's/__sentry-settings-for-celery-tasks__//g' $CONFIG_FILE
        fi
        
        CREDS_LISTER_PATH=/etc/credentials/listers/credentials
        if [ -f $CREDS_LISTER_PATH ]; then
          awk "/__lister-credentials__/{system(\"sed 's/^/  /g' $CREDS_LISTER_PATH\");next}1" $CONFIG_FILE > $CONFIG_FILE_WIP
          mv $CONFIG_FILE_WIP $CONFIG_FILE
        else
          sed -i 's/__lister-credentials__//g' $CONFIG_FILE
        fi
        
        exit 0
        
      logging-configuration.yml: |
        version: 1
        
        handlers:
          console:
            class: logging.StreamHandler
            formatter: json
            stream: ext://sys.stdout
        
        formatters:
          json:
            class: pythonjsonlogger.jsonlogger.JsonFormatter
            # python-json-logger parses the format argument to get the variables it actually expands into the json
            format: "%(asctime)s:%(threadName)s:%(pathname)s:%(lineno)s:%(funcName)s:%(task_name)s:%(task_id)s:%(name)s:%(levelname)s:%(message)s"
        
        loggers:
          celery:
            level: "INFO"
          amqp:
            level: WARNING
          urllib3:
            level: WARNING
          azure.core.pipeline.policies.http_logging_policy:
            level: WARNING
          swh:
            level: "INFO"
          celery.task:
            level: "INFO"
        
        root:
          level: "INFO"
          handlers:
          - console
        
    # Source: swh/templates/loaders/configmap.yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: loader-save-bulk-template
      namespace: swh-cassandra
    data:
      config.yml.template: |
        storage:
          cls: pipeline
          steps:
          - cls: buffer
            min_batch_size:
              content: 100
              content_bytes: 52428800
              directory: 100
              directory_entries: 500
              extid: 100
              release: 100
              release_bytes: 52428800
              revision: 100
              revision_bytes: 52428800
              revision_parents: 200
          - cls: filter
          - cls: retry
          - cls: remote
            url: http://storage-cassandra-read-write-ingress
        celery:
          task_broker: amqp://swhconsumer:${AMQP_PASSWORD}@scheduler0.internal.staging.swh.network:5672/%2f
          task_acks_late: true
          task_queues:
          - save-bulk:swh.loader.bzr.tasks.LoadBazaar
          - save-bulk:swh.loader.cvs.tasks.LoadCvsRepository
          - save-bulk:swh.loader.git.tasks.UpdateGitRepository
          - save-bulk:swh.loader.git.tasks.LoadDiskGitRepository
          - save-bulk:swh.loader.git.tasks.UncompressAndLoadDiskGitRepository
          - save-bulk:swh.loader.mercurial.tasks.LoadArchiveMercurial
          - save-bulk:swh.loader.mercurial.tasks.LoadMercurial
          - save-bulk:swh.loader.svn.tasks.LoadSvnRepository
          - save-bulk:swh.loader.svn.tasks.MountAndLoadSvnRepository
          - save-bulk:swh.loader.svn.tasks.DumpMountAndLoadSvnRepository
          - save-bulk:swh.loader.package.archive.tasks.LoadTarball
        
          sentry_settings_for_celery_tasks:
            __sentry-settings-for-celery-tasks__
        metadata_fetcher_credentials:
          __metadata-fetcher-credentials__
        
      init-container-entrypoint.sh: |
        #!/bin/bash
        
        set -e
        
        CONFIG_FILE=/etc/swh/config.yml
        CONFIG_FILE_WIP=/tmp/wip-config.yml
        
        # substitute environment variables when creating the default config.yml
        eval echo \""$(</etc/swh/configuration-template/config.yml.template)"\" \
          > $CONFIG_FILE
        
        
        SENTRY_SETTINGS_PATH=/etc/credentials/sentry-settings/sentry_settings_for_celery_tasks
        if [ -f $SENTRY_SETTINGS_PATH ]; then
          awk "/__sentry-settings-for-celery-tasks__/{system(\"sed 's/^/    /g' $SENTRY_SETTINGS_PATH\");next}1" $CONFIG_FILE > $CONFIG_FILE_WIP
          mv $CONFIG_FILE_WIP $CONFIG_FILE
        else
          sed -i 's/__sentry-settings-for-celery-tasks__//g' $CONFIG_FILE
        fi
        
        CREDS_LISTER_PATH=/etc/credentials/metadata-fetcher/credentials
        if [ -f $CREDS_LISTER_PATH ]; then
          awk "/__metadata-fetcher-credentials__/{system(\"sed 's/^/  /g' $CREDS_LISTER_PATH\");next}1" $CONFIG_FILE > $CONFIG_FILE_WIP
          mv $CONFIG_FILE_WIP $CONFIG_FILE
        else
          sed -i 's/__metadata-fetcher-credentials__//g' $CONFIG_FILE
        fi
        
        exit 0
        
      logging-configuration.yml: |
        version: 1
        
        handlers:
          console:
            class: logging.StreamHandler
            formatter: json
            stream: ext://sys.stdout
        
        formatters:
          json:
            class: pythonjsonlogger.jsonlogger.JsonFormatter
            # python-json-logger parses the format argument to get the variables it actually expands into the json
            format: "%(asctime)s:%(threadName)s:%(pathname)s:%(lineno)s:%(funcName)s:%(task_name)s:%(task_id)s:%(name)s:%(levelname)s:%(message)s"
        
        loggers:
          celery:
            level: "INFO"
          amqp:
            level: WARNING
          urllib3:
            level: WARNING
          azure.core.pipeline.policies.http_logging_policy:
            level: WARNING
          swh:
            level: "INFO"
          celery.task:
            level: "INFO"
        
        root:
          level: "INFO"
          handlers:
          - console
        
    # Source: swh/templates/listers/deployment.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: lister-save-bulk
      namespace: swh-cassandra
      labels:
        app: lister-save-bulk
    spec:
      revisionHistoryLimit: 2
      selector:
        matchLabels:
          app: lister-save-bulk
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxSurge: 1
      template:
        metadata:
          labels:
            app: lister-save-bulk
          annotations:
            # Force a rollout upgrade if the configuration changes
            checksum/config: 7112d1084fbc15ba336705ada811864a97de3f3a2119da390524ef4046c34931
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: swh/lister
                    operator: In
                    values:
                    - "true"
          priorityClassName: swh-cassandra-normal-workload
          terminationGracePeriodSeconds: 3600
          initContainers:
          - name: prepare-configuration
            image: "debian:bullseye"
            imagePullPolicy: IfNotPresent
            env:
            - name: AMQP_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: swhconsumer-password
                  name: amqp-secrets
                  optional: false
            command:
            - /entrypoint.sh
            volumeMounts:
            - name: configuration-template
              mountPath: /entrypoint.sh
              subPath: init-container-entrypoint.sh
              readOnly: true
            - name: configuration
              mountPath: /etc/swh
            - name: configuration-template
              mountPath: /etc/swh/configuration-template
            - name: lister-credentials-secrets
              mountPath: /etc/credentials/listers
              readOnly: true
            - name: sentry-settings-for-celery-tasks
              mountPath: /etc/credentials/sentry-settings
              readOnly: true
          containers:
          - name: listers
            resources:
              requests:
                memory: 256Mi
                cpu: 250m
            image: "container-registry.softwareheritage.org/swh/infra/swh-apps/lister:20241014.2"
            imagePullPolicy: IfNotPresent
            command:
            - /bin/bash
            args:
            - "-c"
            - /opt/swh/entrypoint.sh
            lifecycle:
              preStop:
                exec:
                  command:
                  - /pre-stop.sh
            env:
            - name: STATSD_HOST
              value: prometheus-statsd-exporter
            - name: STATSD_PORT
              value: 9125
            - name: STATSD_TAGS
              value: "deployment:lister-save-bulk"
            - name: MAX_TASKS_PER_CHILD
              value: 1
            - name: SWH_LOG_LEVEL
              value: INFO
            - name: SWH_CONFIG_FILENAME
              value: /etc/swh/config.yml
            - name: SWH_LOG_CONFIG
              value: /etc/swh/logging-configuration.yml
            - name: SWH_SENTRY_ENVIRONMENT
              value: staging
            - name: SWH_SENTRY_DISABLE_LOGGING_EVENTS
              value: yes
            volumeMounts:
            - name: lister-utils
              mountPath: /pre-stop.sh
              subPath: pre-stop.sh
            - name: configuration
              mountPath: /etc/swh
            - name: configuration-template
              mountPath: /etc/swh/logging-configuration.yml
              subPath: logging-configuration.yml
              readOnly: true
          volumes:
          - name: configuration
            ephemeral:
              volumeClaimTemplate:
                metadata:
                  labels:
                    type: ephemeral-volume
                spec:
                  accessModes:
                  - ReadWriteOnce
                  resources:
                    requests:
                      storage: 100Gi
                  storageClassName: local-path
          - name: configuration-template
            configMap:
              name: lister-save-bulk-template
              defaultMode: 0777
              items:
              - key: config.yml.template
                path: config.yml.template
              - key: init-container-entrypoint.sh
                path: init-container-entrypoint.sh
              - key: logging-configuration.yml
                path: logging-configuration.yml
          - name: lister-utils
            configMap:
              name: lister-utils
              defaultMode: 0777
              items:
              - key: pre-stop-idempotent.sh
                path: pre-stop.sh
          - name: lister-credentials-secrets
            secret:
              secretName: lister-credentials-secrets
              optional: true
          - name: sentry-settings-for-celery-tasks
            secret:
              secretName: sentry-settings-for-celery-tasks
              optional: true
    # Source: swh/templates/loaders/deployment.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: loader-save-bulk
      namespace: swh-cassandra
      labels:
        app: loader-save-bulk
    spec:
      revisionHistoryLimit: 2
      selector:
        matchLabels:
          app: loader-save-bulk
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxSurge: 1
      template:
        metadata:
          labels:
            app: loader-save-bulk
          annotations:
            # Force a rollout upgrade if the configuration changes
            checksum/config: 99e0abb538a030866ff7ad7209bac7c693de9e96ac792112ab1af36f1b60fd46
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: swh/loader
                    operator: In
                    values:
                    - "true"
          priorityClassName: swh-cassandra-normal-workload
          terminationGracePeriodSeconds: 3600
          dnsConfig:
            options:
            - name: ndots
              value: 1
            searches:
            - cluster.local
            - svc.cluster.local
            - swh-cassandra.svc.cluster.local
          initContainers:
          - name: prepare-configuration
            image: "debian:bullseye"
            imagePullPolicy: IfNotPresent
            env:
            - name: AMQP_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: swhconsumer-password
                  name: amqp-secrets
                  optional: false
            command:
            - /entrypoint.sh
            volumeMounts:
            - name: configuration-template
              mountPath: /entrypoint.sh
              subPath: init-container-entrypoint.sh
              readOnly: true
            - name: configuration
              mountPath: /etc/swh
            - name: configuration-template
              mountPath: /etc/swh/configuration-template
            - name: metadata-fetcher-credentials
              mountPath: /etc/credentials/metadata-fetcher
              readOnly: true
            - name: sentry-settings-for-celery-tasks
              mountPath: /etc/credentials/sentry-settings
              readOnly: true
          containers:
          - name: loaders
            image: "container-registry.softwareheritage.org/swh/infra/swh-apps/loader_savecodenow:20241014.1"
            imagePullPolicy: IfNotPresent
            command:
            - /opt/swh/entrypoint.sh
            resources:
              requests:
                memory: 200Mi
                cpu: 50m
            lifecycle:
              preStop:
                exec:
                  command:
                  - /pre-stop.sh
            env:
            - name: STATSD_HOST
              value: prometheus-statsd-exporter
            - name: STATSD_PORT
              value: 9125
            - name: STATSD_TAGS
              value: "deployment:loader-save-bulk"
            - name: MAX_TASKS_PER_CHILD
              value: 10
            - name: SWH_LOG_LEVEL
              value: INFO
            - name: SWH_CONFIG_FILENAME
              value: /etc/swh/config.yml
            - name: SWH_LOG_CONFIG
              value: /etc/swh/logging-configuration.yml
            - name: SWH_SENTRY_ENVIRONMENT
              value: staging
            - name: SWH_SENTRY_DISABLE_LOGGING_EVENTS
              value: yes
            volumeMounts:
            - name: loader-utils
              mountPath: /pre-stop.sh
              subPath: pre-stop.sh
            - name: configuration
              mountPath: /etc/swh
            - name: localstorage
              mountPath: /tmp
            - name: configuration-template
              mountPath: /etc/swh/logging-configuration.yml
              subPath: logging-configuration.yml
              readOnly: true
          volumes:
          - name: localstorage
            ephemeral:
              volumeClaimTemplate:
                metadata:
                  labels:
                    type: ephemeral-volume
                spec:
                  accessModes:
                  - ReadWriteOnce
                  resources:
                    requests:
                      storage: 100Gi
                  storageClassName: local-path
          - name: configuration
            emptyDir: {}
          - name: configuration-template
            configMap:
              name: loader-save-bulk-template
              defaultMode: 0777
              items:
              - key: config.yml.template
                path: config.yml.template
              - key: init-container-entrypoint.sh
                path: init-container-entrypoint.sh
              - key: logging-configuration.yml
                path: logging-configuration.yml
          - name: loader-utils
            configMap:
              name: loader-utils
              defaultMode: 0777
              items:
              - key: pre-stop-idempotent.sh
                path: pre-stop.sh
          - name: metadata-fetcher-credentials
            secret:
              secretName: metadata-fetcher-credentials
              optional: true
          - name: sentry-settings-for-celery-tasks
            secret:
              secretName: sentry-settings-for-celery-tasks
              optional: true
    # Source: swh/templates/scheduler/extra-services-deployment.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      namespace: swh-cassandra
      name: scheduler-runner-first-visits
      labels:
        app: scheduler-runner-first-visits
    spec:
      revisionHistoryLimit: 2
      replicas: 1
      selector:
        matchLabels:
          app: scheduler-runner-first-visits
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxSurge: 1
      template:
        metadata:
          labels:
            app: scheduler-runner-first-visits
          annotations:
            checksum/config: 4c7048bda6a4e2c34e0e15f9598c1344a57a2a6eb0a3b8c844ce0e70742bce0b
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: swh/scheduler
                    operator: In
                    values:
                    - "true"
          priorityClassName: swh-cassandra-frontend-rpc-workload
          initContainers:
          - name: prepare-configuration
            image: "debian:bullseye"
            imagePullPolicy: IfNotPresent
            command:
            - /bin/bash
            args:
            - "-c"
            - "eval echo "\"$(</etc/swh/configuration-template/config.yml.template)\"" > /etc/swh/config.yml"
            env:
            - name: AMQP_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: swhproducer-password
                  name: amqp-secrets
                  optional: false
            volumeMounts:
            - name: configuration
              mountPath: /etc/swh
            - name: configuration-template
              mountPath: /etc/swh/configuration-template
          containers:
          - name: scheduler-runner-first-visits
            resources:
              requests:
                memory: 100Mi
                cpu: 10m
            image: "container-registry.softwareheritage.org/swh/infra/swh-apps/scheduler:20241014.1"
            command:
            - /opt/swh/entrypoint.sh
            args:
            - swh
            - scheduler
            - "--config-file"
            - /etc/swh/config.yml
            - start-runner-first-visits
            - "--period"
            - 10
            env:
            - name: STATSD_HOST
              value: prometheus-statsd-exporter
            - name: STATSD_PORT
              value: 9125
            - name: STATSD_TAGS
              value: "deployment:scheduler-runner-first-visits"
            - name: SWH_CONFIG_FILENAME
              value: /etc/swh/config.yml
            - name: SWH_LOG_LEVEL
              value: INFO
            - name: SWH_SENTRY_ENVIRONMENT
              value: staging
            - name: SWH_MAIN_PACKAGE
              value: swh.scheduler
            - name: SWH_SENTRY_DSN
              valueFrom:
                secretKeyRef:
                  name: scheduler-sentry-secrets
                  key: sentry-dsn
                  # if the setting doesn't exist, sentry issue pushes will be disabled
                  optional: false
            - name: SWH_SENTRY_DISABLE_LOGGING_EVENTS
              value: "true"
            imagePullPolicy: IfNotPresent
            volumeMounts:
            - name: configuration
              mountPath: /etc/swh
          volumes:
          - name: configuration
            emptyDir: {}
          - name: configuration-template
            configMap:
              name: extra-services-configuration-template
              items:
              - key: config.yml.template
                path: config.yml.template
    # Source: swh/templates/listers/keda-autoscaling.yaml
    apiVersion: keda.sh/v1alpha1
    kind: ScaledObject
    metadata:
      name: lister-save-bulk-operators
      namespace: swh-cassandra
    spec:
      scaleTargetRef:
        apiVersion: apps/v1 # Optional. Default: apps/v1
        kind: Deployment # Optional. Default: Deployment
        # Mandatory. Must be in same namespace as ScaledObject
        name: lister-save-bulk
        # envSourceContainerName: {container-name} # Optional. Default:
        # .spec.template.spec.containers[0]
      pollingInterval: 30 # Optional. Default: 30 seconds
      cooldownPeriod: 3600
      # ^ Optional. Default: 300 seconds
      idleReplicaCount: 0 # Set to 0 to stop all the workers when
      # there is no activity on the queue
      minReplicaCount: 0
      maxReplicaCount: 1
      triggers:
      - type: rabbitmq
        authenticationRef:
          name: amqp-authentication-lister-save-bulk
        metadata:
          protocol: auto # Optional. Specifies protocol to use,
          # either amqp or http, or auto to
          # autodetect based on the `host` value.
          # Default value is auto.
          mode: QueueLength # QueueLength to trigger on number of msgs in queue
          excludeUnacknowledged: "false" # QueueLength should include unacked messages
          # Implies "http" protocol is used
          value: 1
          queueName: swh.lister.save_bulk.tasks.SaveBulkListerTask
          vhostName: / # Optional. If not specified, use the vhost in the
          # `host` connection string. Alternatively, you can
          # use existing environment variables to read
          # configuration from: See details in "Parameter
          # list" section hostFromEnv: RABBITMQ_HOST%
    # Source: swh/templates/loaders/keda-autoscaling.yaml
    apiVersion: keda.sh/v1alpha1
    kind: ScaledObject
    metadata:
      name: loader-save-bulk-operators
      namespace: swh-cassandra
    spec:
      scaleTargetRef:
        apiVersion: apps/v1 # Optional. Default: apps/v1
        kind: Deployment # Optional. Default: Deployment
        # Mandatory. Must be in same namespace as ScaledObject
        name: loader-save-bulk
        # envSourceContainerName: {container-name} # Optional. Default:
        # .spec.template.spec.containers[0]
      pollingInterval: 30 # Optional. Default: 30 seconds
      cooldownPeriod: 300
      # ^ Optional. Default: 300 seconds
      idleReplicaCount: 0 # Set to 0 to stop all the workers when
      # there is no activity on the queue
      minReplicaCount: 0
      maxReplicaCount: 1
      triggers:
      - type: rabbitmq
        authenticationRef:
          name: amqp-authentication-loader-save-bulk
        metadata:
          protocol: auto # Optional. Specifies protocol to use,
          # either amqp or http, or auto to
          # autodetect based on the `host` value.
          # Default value is auto.
          mode: QueueLength # QueueLength to trigger on number of msgs in queue
          excludeUnacknowledged: "false" # QueueLength should include unacked messages
          # Implies "http" protocol is used
          value: 10
          queueName: "save-bulk:swh.loader.bzr.tasks.LoadBazaar"
          vhostName: / # Optional. If not specified, use the vhost in the
          # `host` connection string. Alternatively, you can
          # use existing environment variables to read
          # configuration from: See details in "Parameter
          # list" section hostFromEnv: RABBITMQ_HOST%
      - type: rabbitmq
        authenticationRef:
          name: amqp-authentication-loader-save-bulk
        metadata:
          protocol: auto # Optional. Specifies protocol to use,
          # either amqp or http, or auto to
          # autodetect based on the `host` value.
          # Default value is auto.
          mode: QueueLength # QueueLength to trigger on number of msgs in queue
          excludeUnacknowledged: "false" # QueueLength should include unacked messages
          # Implies "http" protocol is used
          value: 10
          queueName: "save-bulk:swh.loader.cvs.tasks.LoadCvsRepository"
          vhostName: / # Optional. If not specified, use the vhost in the
          # `host` connection string. Alternatively, you can
          # use existing environment variables to read
          # configuration from: See details in "Parameter
          # list" section hostFromEnv: RABBITMQ_HOST%
      - type: rabbitmq
        authenticationRef:
          name: amqp-authentication-loader-save-bulk
        metadata:
          protocol: auto # Optional. Specifies protocol to use,
          # either amqp or http, or auto to
          # autodetect based on the `host` value.
          # Default value is auto.
          mode: QueueLength # QueueLength to trigger on number of msgs in queue
          excludeUnacknowledged: "false" # QueueLength should include unacked messages
          # Implies "http" protocol is used
          value: 10
          queueName: "save-bulk:swh.loader.git.tasks.UpdateGitRepository"
          vhostName: / # Optional. If not specified, use the vhost in the
          # `host` connection string. Alternatively, you can
          # use existing environment variables to read
          # configuration from: See details in "Parameter
          # list" section hostFromEnv: RABBITMQ_HOST%
      - type: rabbitmq
        authenticationRef:
          name: amqp-authentication-loader-save-bulk
        metadata:
          protocol: auto # Optional. Specifies protocol to use,
          # either amqp or http, or auto to
          # autodetect based on the `host` value.
          # Default value is auto.
          mode: QueueLength # QueueLength to trigger on number of msgs in queue
          excludeUnacknowledged: "false" # QueueLength should include unacked messages
          # Implies "http" protocol is used
          value: 10
          queueName: "save-bulk:swh.loader.git.tasks.LoadDiskGitRepository"
          vhostName: / # Optional. If not specified, use the vhost in the
          # `host` connection string. Alternatively, you can
          # use existing environment variables to read
          # configuration from: See details in "Parameter
          # list" section hostFromEnv: RABBITMQ_HOST%
      - type: rabbitmq
        authenticationRef:
          name: amqp-authentication-loader-save-bulk
        metadata:
          protocol: auto # Optional. Specifies protocol to use,
          # either amqp or http, or auto to
          # autodetect based on the `host` value.
          # Default value is auto.
          mode: QueueLength # QueueLength to trigger on number of msgs in queue
          excludeUnacknowledged: "false" # QueueLength should include unacked messages
          # Implies "http" protocol is used
          value: 10
          queueName: "save-bulk:swh.loader.git.tasks.UncompressAndLoadDiskGitRepository"
          vhostName: / # Optional. If not specified, use the vhost in the
          # `host` connection string. Alternatively, you can
          # use existing environment variables to read
          # configuration from: See details in "Parameter
          # list" section hostFromEnv: RABBITMQ_HOST%
      - type: rabbitmq
        authenticationRef:
          name: amqp-authentication-loader-save-bulk
        metadata:
          protocol: auto # Optional. Specifies protocol to use,
          # either amqp or http, or auto to
          # autodetect based on the `host` value.
          # Default value is auto.
          mode: QueueLength # QueueLength to trigger on number of msgs in queue
          excludeUnacknowledged: "false" # QueueLength should include unacked messages
          # Implies "http" protocol is used
          value: 10
          queueName: "save-bulk:swh.loader.mercurial.tasks.LoadArchiveMercurial"
          vhostName: / # Optional. If not specified, use the vhost in the
          # `host` connection string. Alternatively, you can
          # use existing environment variables to read
          # configuration from: See details in "Parameter
          # list" section hostFromEnv: RABBITMQ_HOST%
      - type: rabbitmq
        authenticationRef:
          name: amqp-authentication-loader-save-bulk
        metadata:
          protocol: auto # Optional. Specifies protocol to use,
          # either amqp or http, or auto to
          # autodetect based on the `host` value.
          # Default value is auto.
          mode: QueueLength # QueueLength to trigger on number of msgs in queue
          excludeUnacknowledged: "false" # QueueLength should include unacked messages
          # Implies "http" protocol is used
          value: 10
          queueName: "save-bulk:swh.loader.mercurial.tasks.LoadMercurial"
          vhostName: / # Optional. If not specified, use the vhost in the
          # `host` connection string. Alternatively, you can
          # use existing environment variables to read
          # configuration from: See details in "Parameter
          # list" section hostFromEnv: RABBITMQ_HOST%
      - type: rabbitmq
        authenticationRef:
          name: amqp-authentication-loader-save-bulk
        metadata:
          protocol: auto # Optional. Specifies protocol to use,
          # either amqp or http, or auto to
          # autodetect based on the `host` value.
          # Default value is auto.
          mode: QueueLength # QueueLength to trigger on number of msgs in queue
          excludeUnacknowledged: "false" # QueueLength should include unacked messages
          # Implies "http" protocol is used
          value: 10
          queueName: "save-bulk:swh.loader.svn.tasks.LoadSvnRepository"
          vhostName: / # Optional. If not specified, use the vhost in the
          # `host` connection string. Alternatively, you can
          # use existing environment variables to read
          # configuration from: See details in "Parameter
          # list" section hostFromEnv: RABBITMQ_HOST%
      - type: rabbitmq
        authenticationRef:
          name: amqp-authentication-loader-save-bulk
        metadata:
          protocol: auto # Optional. Specifies protocol to use,
          # either amqp or http, or auto to
          # autodetect based on the `host` value.
          # Default value is auto.
          mode: QueueLength # QueueLength to trigger on number of msgs in queue
          excludeUnacknowledged: "false" # QueueLength should include unacked messages
          # Implies "http" protocol is used
          value: 10
          queueName: "save-bulk:swh.loader.svn.tasks.MountAndLoadSvnRepository"
          vhostName: / # Optional. If not specified, use the vhost in the
          # `host` connection string. Alternatively, you can
          # use existing environment variables to read
          # configuration from: See details in "Parameter
          # list" section hostFromEnv: RABBITMQ_HOST%
      - type: rabbitmq
        authenticationRef:
          name: amqp-authentication-loader-save-bulk
        metadata:
          protocol: auto # Optional. Specifies protocol to use,
          # either amqp or http, or auto to
          # autodetect based on the `host` value.
          # Default value is auto.
          mode: QueueLength # QueueLength to trigger on number of msgs in queue
          excludeUnacknowledged: "false" # QueueLength should include unacked messages
          # Implies "http" protocol is used
          value: 10
          queueName: "save-bulk:swh.loader.svn.tasks.DumpMountAndLoadSvnRepository"
          vhostName: / # Optional. If not specified, use the vhost in the
          # `host` connection string. Alternatively, you can
          # use existing environment variables to read
          # configuration from: See details in "Parameter
          # list" section hostFromEnv: RABBITMQ_HOST%
      - type: rabbitmq
        authenticationRef:
          name: amqp-authentication-loader-save-bulk
        metadata:
          protocol: auto # Optional. Specifies protocol to use,
          # either amqp or http, or auto to
          # autodetect based on the `host` value.
          # Default value is auto.
          mode: QueueLength # QueueLength to trigger on number of msgs in queue
          excludeUnacknowledged: "false" # QueueLength should include unacked messages
          # Implies "http" protocol is used
          value: 10
          queueName: "save-bulk:swh.loader.package.archive.tasks.LoadTarball"
          vhostName: / # Optional. If not specified, use the vhost in the
          # `host` connection string. Alternatively, you can
          # use existing environment variables to read
          # configuration from: See details in "Parameter
          # list" section hostFromEnv: RABBITMQ_HOST%
    # Source: swh/templates/listers/keda-autoscaling.yaml
    apiVersion: keda.sh/v1alpha1
    kind: TriggerAuthentication
    metadata:
      name: amqp-authentication-lister-save-bulk
      namespace: swh-cassandra
    spec:
      secretTargetRef:
      - parameter: host # "host" is required by the scalerObject trigger metadata
        name: common-secrets
        key: rabbitmq-http-host
    # Source: swh/templates/loaders/keda-autoscaling.yaml
    apiVersion: keda.sh/v1alpha1
    kind: TriggerAuthentication
    metadata:
      name: amqp-authentication-loader-save-bulk
      namespace: swh-cassandra
    spec:
      secretTargetRef:
      - parameter: host # "host" is required by the scalerObject trigger metadata
        name: common-secrets
        key: rabbitmq-http-host
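
The rabbitmq triggers listed above all use `mode: QueueLength` with a target `value`. As a rough sketch of what that implies, assuming KEDA's usual behaviour of handing the queue length to the Horizontal Pod Autoscaler as an average-value target and keeping the largest proposal across triggers (the helper `desired_replicas` and the queue depths below are hypothetical and not taken from this deployment):

```python
import math


def desired_replicas(queue_lengths, target_per_replica, min_replicas, max_replicas):
    """Approximate the replica count for QueueLength triggers.

    queue_lengths: mapping of queue name -> number of messages (ready + unacked).
    target_per_replica: the trigger's `value` (messages one replica should absorb).
    """
    # Each trigger proposes ceil(length / target); the largest proposal wins.
    proposals = [math.ceil(n / target_per_replica) for n in queue_lengths.values()]
    wanted = max(proposals, default=0)
    # Clamp to the ScaledObject bounds (minReplicaCount / maxReplicaCount).
    return max(min_replicas, min(max_replicas, wanted))


# Hypothetical queue depths, for illustration only.
queues = {
    "save-bulk:swh.loader.git.tasks.UpdateGitRepository": 25,
    "save-bulk:swh.loader.svn.tasks.LoadSvnRepository": 0,
}
print(desired_replicas(queues, target_per_replica=10, min_replicas=0, max_replicas=1))
print(desired_replicas(queues, target_per_replica=10, min_replicas=0, max_replicas=5))
```

When a ScaledObject caps `maxReplicaCount` at 1, the result is effectively on/off: zero replicas while every queue is empty, one replica otherwise.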
    
  

data.config.yml.template  (v1/ConfigMap/swh-cassandra/web-cassandra-configuration-template)
  ± value change in multiline text (one insert, no deletions)
    + - swh.web.save_bulk


data.config.yml.template  (v1/ConfigMap/swh-cassandra/web-webhooks-configuration-template)
  ± value change in multiline text (one insert, no deletions)
    + - swh.web.save_bulk


spec.template.metadata.annotations.checksum/config  (apps/v1/Deployment/swh-cassandra/web-cassandra)
  ± value change
    - 63f91d0d954ad4733deadfe303d8a447a3be38dcf31d5c035b6d30ddccc42934
    + 256e0e2a2a2069b1c01681de89ea1778456dac1cbdb757bd4daa2cdba1f78a12

spec.template.metadata.annotations.checksum/config  (apps/v1/Deployment/swh-cassandra/web-webhooks)
  ± value change
    - 273830564690f1c2225558337f98e73c7003c012612c807bc1a7d32d9340e752
    + 6611b6d5b581451b710773f38a8562ae154b340589d6d871b478ea131091a23a
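
The `checksum/config` annotations change because the rendered configuration changed, and a changed pod-template annotation is what makes Kubernetes roll the pods. A minimal sketch of the idea, assuming the annotation is a SHA-256 hex digest of the rendered config (the 64-character values above are consistent with that, but the chart's exact hash input is not shown here; `config_checksum` and the sample fragments are hypothetical):

```python
import hashlib


def config_checksum(rendered_config: str) -> str:
    """Return a SHA-256 hex digest usable as a checksum/config annotation."""
    return hashlib.sha256(rendered_config.encode("utf-8")).hexdigest()


# Hypothetical before/after config fragments: appending one list entry
# is enough to produce a different digest, hence a new rollout.
before = "apps:\n- example.app\n"
after = before + "- swh.web.save_bulk\n"

old, new = config_checksum(before), config_checksum(after)
print(len(old), old != new)
```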



------------- diff for environment staging namespace swh-cassandra-next-version -------------

     _        __  __
   _| |_   _ / _|/ _|  between /tmp/swh-chart.swh.g4AQB5iO/staging-swh-cassandra-next-version.before, 360 documents
 / _' | | | | |_| |_       and /tmp/swh-chart.swh.g4AQB5iO/staging-swh-cassandra-next-version.after, 369 documents
| (_| | |_| |  _|  _|
 \__,_|\__, |_| |_|   returned three differences
        |___/

(file level)
    ---
    # Source: swh/templates/listers/configmap.yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: lister-save-bulk-template
      namespace: swh-cassandra-next-version
    data:
      config.yml.template: |
        storage:
          cls: pipeline
          steps:
          - cls: retry
          - cls: remote
            url: http://storage-ro-postgresql:5002
        scheduler:
          cls: remote
          url: http://scheduler-rpc:5008
        celery:
          task_broker: amqp://${AMQP_USERNAME}:${AMQP_PASSWORD}@rabbitmq-scheduler:5672/%2f
          task_acks_late: true
          task_queues:
          - swh.lister.save_bulk.tasks.SaveBulkListerTask
        
          sentry_settings_for_celery_tasks:
            __sentry-settings-for-celery-tasks__
        credentials:
          __lister-credentials__
        
      init-container-entrypoint.sh: |
        #!/bin/bash
        
        set -e
        
        CONFIG_FILE=/etc/swh/config.yml
        CONFIG_FILE_WIP=/tmp/wip-config.yml
        
        # substitute environment variables when creating the default config.yml
        eval echo \""$(</etc/swh/configuration-template/config.yml.template)"\" \
          > $CONFIG_FILE
        
        
        SENTRY_SETTINGS_PATH=/etc/credentials/sentry-settings/sentry_settings_for_celery_tasks
        if [ -f $SENTRY_SETTINGS_PATH ]; then
          awk "/__sentry-settings-for-celery-tasks__/{system(\"sed 's/^/    /g' $SENTRY_SETTINGS_PATH\");next}1" $CONFIG_FILE > $CONFIG_FILE_WIP
          mv $CONFIG_FILE_WIP $CONFIG_FILE
        else
          sed -i 's/__sentry-settings-for-celery-tasks__//g' $CONFIG_FILE
        fi
        
        CREDS_LISTER_PATH=/etc/credentials/listers/credentials
        if [ -f $CREDS_LISTER_PATH ]; then
          awk "/__lister-credentials__/{system(\"sed 's/^/  /g' $CREDS_LISTER_PATH\");next}1" $CONFIG_FILE > $CONFIG_FILE_WIP
          mv $CONFIG_FILE_WIP $CONFIG_FILE
        else
          sed -i 's/__lister-credentials__//g' $CONFIG_FILE
        fi
        
        exit 0
        
      logging-configuration.yml: |
        version: 1
        
        handlers:
          console:
            class: logging.StreamHandler
            formatter: json
            stream: ext://sys.stdout
        
        formatters:
          json:
            class: pythonjsonlogger.jsonlogger.JsonFormatter
            # python-json-logger parses the format argument to get the variables it actually expands into the json
            format: "%(asctime)s:%(threadName)s:%(pathname)s:%(lineno)s:%(funcName)s:%(task_name)s:%(task_id)s:%(name)s:%(levelname)s:%(message)s"
        
        loggers:
          celery:
            level: "INFO"
          amqp:
            level: WARNING
          urllib3:
            level: WARNING
          azure.core.pipeline.policies.http_logging_policy:
            level: WARNING
          swh:
            level: "INFO"
          celery.task:
            level: "INFO"
        
        root:
          level: "INFO"
          handlers:
          - console
        
    # Source: swh/templates/loaders/configmap.yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: loader-save-bulk-template
      namespace: swh-cassandra-next-version
    data:
      config.yml.template: |
        storage:
          cls: pipeline
          steps:
          - cls: buffer
            min_batch_size:
              content: 100
              content_bytes: 52428800
              directory: 100
              directory_entries: 500
              extid: 100
              release: 100
              release_bytes: 52428800
              revision: 100
              revision_bytes: 52428800
              revision_parents: 200
          - cls: filter
          - cls: retry
          - cls: remote
            url: http://storage-rw-cassandra:5002
        celery:
          task_broker: amqp://${AMQP_USERNAME}:${AMQP_PASSWORD}@rabbitmq-scheduler:5672/%2f
          task_acks_late: true
          task_queues:
          - save-bulk:swh.loader.bzr.tasks.LoadBazaar
          - save-bulk:swh.loader.cvs.tasks.LoadCvsRepository
          - save-bulk:swh.loader.git.tasks.UpdateGitRepository
          - save-bulk:swh.loader.git.tasks.LoadDiskGitRepository
          - save-bulk:swh.loader.git.tasks.UncompressAndLoadDiskGitRepository
          - save-bulk:swh.loader.mercurial.tasks.LoadArchiveMercurial
          - save-bulk:swh.loader.mercurial.tasks.LoadMercurial
          - save-bulk:swh.loader.svn.tasks.LoadSvnRepository
          - save-bulk:swh.loader.svn.tasks.MountAndLoadSvnRepository
          - save-bulk:swh.loader.svn.tasks.DumpMountAndLoadSvnRepository
          - save-bulk:swh.loader.package.archive.tasks.LoadTarball
        
          sentry_settings_for_celery_tasks:
            __sentry-settings-for-celery-tasks__
        metadata_fetcher_credentials:
          __metadata-fetcher-credentials__
        
      init-container-entrypoint.sh: |
        #!/bin/bash
        
        set -e
        
        CONFIG_FILE=/etc/swh/config.yml
        CONFIG_FILE_WIP=/tmp/wip-config.yml
        
        # substitute environment variables when creating the default config.yml
        eval echo \""$(</etc/swh/configuration-template/config.yml.template)"\" \
          > $CONFIG_FILE
        
        
        SENTRY_SETTINGS_PATH=/etc/credentials/sentry-settings/sentry_settings_for_celery_tasks
        if [ -f $SENTRY_SETTINGS_PATH ]; then
          awk "/__sentry-settings-for-celery-tasks__/{system(\"sed 's/^/    /g' $SENTRY_SETTINGS_PATH\");next}1" $CONFIG_FILE > $CONFIG_FILE_WIP
          mv $CONFIG_FILE_WIP $CONFIG_FILE
        else
          sed -i 's/__sentry-settings-for-celery-tasks__//g' $CONFIG_FILE
        fi
        
        CREDS_LISTER_PATH=/etc/credentials/metadata-fetcher/credentials
        if [ -f $CREDS_LISTER_PATH ]; then
          awk "/__metadata-fetcher-credentials__/{system(\"sed 's/^/  /g' $CREDS_LISTER_PATH\");next}1" $CONFIG_FILE > $CONFIG_FILE_WIP
          mv $CONFIG_FILE_WIP $CONFIG_FILE
        else
          sed -i 's/__metadata-fetcher-credentials__//g' $CONFIG_FILE
        fi
        
        exit 0
        
      logging-configuration.yml: |
        version: 1
        
        handlers:
          console:
            class: logging.StreamHandler
            formatter: json
            stream: ext://sys.stdout
        
        formatters:
          json:
            class: pythonjsonlogger.jsonlogger.JsonFormatter
            # python-json-logger parses the format argument to get the variables it actually expands into the json
            format: "%(asctime)s:%(threadName)s:%(pathname)s:%(lineno)s:%(funcName)s:%(task_name)s:%(task_id)s:%(name)s:%(levelname)s:%(message)s"
        
        loggers:
          celery:
            level: "INFO"
          amqp:
            level: WARNING
          urllib3:
            level: WARNING
          azure.core.pipeline.policies.http_logging_policy:
            level: WARNING
          swh:
            level: "INFO"
          celery.task:
            level: "INFO"
        
        root:
          level: "INFO"
          handlers:
          - console
        
    # Source: swh/templates/listers/deployment.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: lister-save-bulk
      namespace: swh-cassandra-next-version
      labels:
        app: lister-save-bulk
    spec:
      revisionHistoryLimit: 2
      selector:
        matchLabels:
          app: lister-save-bulk
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxSurge: 1
      template:
        metadata:
          labels:
            app: lister-save-bulk
          annotations:
            # Force a rollout upgrade if the configuration changes
            checksum/config: c1b16646658edcbaf1d42fc623656c7b6f3520f3d2927a3e4a302eb3314676c9
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: swh/lister
                    operator: In
                    values:
                    - "true"
          priorityClassName: swh-cassandra-next-version-normal-workload
          terminationGracePeriodSeconds: 3600
          initContainers:
          - name: prepare-configuration
            image: "debian:bullseye"
            imagePullPolicy: IfNotPresent
            env:
            - name: AMQP_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: password
                  name: rabbitmq-scheduler-secret
                  optional: false
            - name: AMQP_USERNAME
              valueFrom:
                secretKeyRef:
                  key: username
                  name: rabbitmq-scheduler-secret
                  optional: false
            command:
            - /entrypoint.sh
            volumeMounts:
            - name: configuration-template
              mountPath: /entrypoint.sh
              subPath: init-container-entrypoint.sh
              readOnly: true
            - name: configuration
              mountPath: /etc/swh
            - name: configuration-template
              mountPath: /etc/swh/configuration-template
            - name: lister-credentials-secrets
              mountPath: /etc/credentials/listers
              readOnly: true
            - name: sentry-settings-for-celery-tasks
              mountPath: /etc/credentials/sentry-settings
              readOnly: true
          containers:
          - name: listers
            resources:
              requests:
                memory: 256Mi
                cpu: 250m
            image: "container-registry.softwareheritage.org/swh/infra/swh-apps/lister:20241014.2"
            imagePullPolicy: IfNotPresent
            command:
            - /bin/bash
            args:
            - "-c"
            - /opt/swh/entrypoint.sh
            lifecycle:
              preStop:
                exec:
                  command:
                  - /pre-stop.sh
            env:
            - name: STATSD_HOST
              value: prometheus-statsd-exporter
            - name: STATSD_PORT
              value: 9125
            - name: STATSD_TAGS
              value: "deployment:lister-save-bulk"
            - name: MAX_TASKS_PER_CHILD
              value: 1
            - name: SWH_LOG_LEVEL
              value: INFO
            - name: SWH_CONFIG_FILENAME
              value: /etc/swh/config.yml
            - name: SWH_LOG_CONFIG
              value: /etc/swh/logging-configuration.yml
            - name: SWH_SENTRY_ENVIRONMENT
              value: staging
            - name: SWH_SENTRY_DISABLE_LOGGING_EVENTS
              value: yes
            volumeMounts:
            - name: lister-utils
              mountPath: /pre-stop.sh
              subPath: pre-stop.sh
            - name: configuration
              mountPath: /etc/swh
            - name: configuration-template
              mountPath: /etc/swh/logging-configuration.yml
              subPath: logging-configuration.yml
              readOnly: true
          volumes:
          - name: configuration
            ephemeral:
              volumeClaimTemplate:
                metadata:
                  labels:
                    type: ephemeral-volume
                spec:
                  accessModes:
                  - ReadWriteOnce
                  resources:
                    requests:
                      storage: 100Gi
                  storageClassName: local-path
          - name: configuration-template
            configMap:
              name: lister-save-bulk-template
              defaultMode: 0777
              items:
              - key: config.yml.template
                path: config.yml.template
              - key: init-container-entrypoint.sh
                path: init-container-entrypoint.sh
              - key: logging-configuration.yml
                path: logging-configuration.yml
          - name: lister-utils
            configMap:
              name: lister-utils
              defaultMode: 0777
              items:
              - key: pre-stop-idempotent.sh
                path: pre-stop.sh
          - name: lister-credentials-secrets
            secret:
              secretName: lister-credentials-secrets
              optional: true
          - name: sentry-settings-for-celery-tasks
            secret:
              secretName: sentry-settings-for-celery-tasks
              optional: true
    # Source: swh/templates/loaders/deployment.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: loader-save-bulk
      namespace: swh-cassandra-next-version
      labels:
        app: loader-save-bulk
    spec:
      revisionHistoryLimit: 2
      selector:
        matchLabels:
          app: loader-save-bulk
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxSurge: 1
      template:
        metadata:
          labels:
            app: loader-save-bulk
          annotations:
            # Force a rollout upgrade if the configuration changes
            checksum/config: 52ca5c06adc7643d075a12667ffaa81fda770a53e4854e07283fcaf84d96d7fc
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: swh/loader
                    operator: In
                    values:
                    - "true"
          priorityClassName: swh-cassandra-next-version-normal-workload
          terminationGracePeriodSeconds: 60
          dnsConfig:
            options:
            - name: ndots
              value: 1
            searches:
            - cluster.local
            - svc.cluster.local
            - swh-cassandra-next-version.svc.cluster.local
          initContainers:
          - name: prepare-configuration
            image: "debian:bullseye"
            imagePullPolicy: IfNotPresent
            env:
            - name: AMQP_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: password
                  name: rabbitmq-scheduler-secret
                  optional: false
            - name: AMQP_USERNAME
              valueFrom:
                secretKeyRef:
                  key: username
                  name: rabbitmq-scheduler-secret
                  optional: false
            command:
            - /entrypoint.sh
            volumeMounts:
            - name: configuration-template
              mountPath: /entrypoint.sh
              subPath: init-container-entrypoint.sh
              readOnly: true
            - name: configuration
              mountPath: /etc/swh
            - name: configuration-template
              mountPath: /etc/swh/configuration-template
            - name: metadata-fetcher-credentials
              mountPath: /etc/credentials/metadata-fetcher
              readOnly: true
            - name: sentry-settings-for-celery-tasks
              mountPath: /etc/credentials/sentry-settings
              readOnly: true
          containers:
          - name: loaders
            image: "container-registry.softwareheritage.org/swh/infra/swh-apps/loader_savecodenow:20241014.1"
            imagePullPolicy: IfNotPresent
            command:
            - /opt/swh/entrypoint.sh
            resources:
              requests:
                memory: 200Mi
                cpu: 50m
            lifecycle:
              preStop:
                exec:
                  command:
                  - /pre-stop.sh
            env:
            - name: STATSD_HOST
              value: prometheus-statsd-exporter
            - name: STATSD_PORT
              value: 9125
            - name: STATSD_TAGS
              value: "deployment:loader-save-bulk"
            - name: MAX_TASKS_PER_CHILD
              value: 10
            - name: SWH_LOG_LEVEL
              value: INFO
            - name: SWH_CONFIG_FILENAME
              value: /etc/swh/config.yml
            - name: SWH_LOG_CONFIG
              value: /etc/swh/logging-configuration.yml
            - name: SWH_SENTRY_ENVIRONMENT
              value: staging
            - name: SWH_SENTRY_DISABLE_LOGGING_EVENTS
              value: yes
            volumeMounts:
            - name: loader-utils
              mountPath: /pre-stop.sh
              subPath: pre-stop.sh
            - name: configuration
              mountPath: /etc/swh
            - name: localstorage
              mountPath: /tmp
            - name: configuration-template
              mountPath: /etc/swh/logging-configuration.yml
              subPath: logging-configuration.yml
              readOnly: true
          volumes:
          - name: localstorage
            ephemeral:
              volumeClaimTemplate:
                metadata:
                  labels:
                    type: ephemeral-volume
                spec:
                  accessModes:
                  - ReadWriteOnce
                  resources:
                    requests:
                      storage: 100Gi
                  storageClassName: local-path
          - name: configuration
            emptyDir: {}
          - name: configuration-template
            configMap:
              name: loader-save-bulk-template
              defaultMode: 0777
              items:
              - key: config.yml.template
                path: config.yml.template
              - key: init-container-entrypoint.sh
                path: init-container-entrypoint.sh
              - key: logging-configuration.yml
                path: logging-configuration.yml
          - name: loader-utils
            configMap:
              name: loader-utils
              defaultMode: 0777
              items:
              - key: pre-stop-idempotent.sh
                path: pre-stop.sh
          - name: metadata-fetcher-credentials
            secret:
              secretName: metadata-fetcher-credentials
              optional: true
          - name: sentry-settings-for-celery-tasks
            secret:
              secretName: sentry-settings-for-celery-tasks
              optional: true
    # Source: swh/templates/scheduler/extra-services-deployment.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      namespace: swh-cassandra-next-version
      name: scheduler-runner-first-visits
      labels:
        app: scheduler-runner-first-visits
    spec:
      revisionHistoryLimit: 2
      replicas: 1
      selector:
        matchLabels:
          app: scheduler-runner-first-visits
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxSurge: 1
      template:
        metadata:
          labels:
            app: scheduler-runner-first-visits
          annotations:
            checksum/config: 63de62e998530279a875c5b3d9fd3628ab2989d523a1a56468129cc8f3f31507
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: swh/scheduler
                    operator: In
                    values:
                    - "true"
          priorityClassName: swh-cassandra-next-version-frontend-rpc-workload
          initContainers:
          - name: prepare-configuration
            image: "debian:bullseye"
            imagePullPolicy: IfNotPresent
            command:
            - /bin/bash
            args:
            - "-c"
            - 'eval echo "\"$(</etc/swh/configuration-template/config.yml.template)\"" > /etc/swh/config.yml'
            env:
            - name: AMQP_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: password
                  name: rabbitmq-scheduler-secret
                  optional: false
            - name: AMQP_USERNAME
              valueFrom:
                secretKeyRef:
                  key: username
                  name: rabbitmq-scheduler-secret
                  optional: false
            volumeMounts:
            - name: configuration
              mountPath: /etc/swh
            - name: configuration-template
              mountPath: /etc/swh/configuration-template
          containers:
          - name: scheduler-runner-first-visits
            resources:
              requests:
                memory: 100Mi
                cpu: 10m
            image: "container-registry.softwareheritage.org/swh/infra/swh-apps/scheduler:20241014.1"
            command:
            - /opt/swh/entrypoint.sh
            args:
            - swh
            - scheduler
            - "--config-file"
            - /etc/swh/config.yml
            - start-runner-first-visits
            - "--period"
            - 10
            env:
            - name: STATSD_HOST
              value: prometheus-statsd-exporter
            - name: STATSD_PORT
              value: 9125
            - name: STATSD_TAGS
              value: "deployment:scheduler-runner-first-visits"
            - name: SWH_CONFIG_FILENAME
              value: /etc/swh/config.yml
            - name: SWH_LOG_LEVEL
              value: INFO
            imagePullPolicy: IfNotPresent
            volumeMounts:
            - name: configuration
              mountPath: /etc/swh
          volumes:
          - name: configuration
            emptyDir: {}
          - name: configuration-template
            configMap:
              name: extra-services-configuration-template
              items:
              - key: config.yml.template
                path: config.yml.template
    # Source: swh/templates/listers/keda-autoscaling.yaml
    apiVersion: keda.sh/v1alpha1
    kind: ScaledObject
    metadata:
      name: lister-save-bulk-operators
      namespace: swh-cassandra-next-version
    spec:
      scaleTargetRef:
        apiVersion: apps/v1 # Optional. Default: apps/v1
        kind: Deployment # Optional. Default: Deployment
        # Mandatory. Must be in same namespace as ScaledObject
        name: lister-save-bulk
        # envSourceContainerName: {container-name} # Optional. Default:
        # .spec.template.spec.containers[0]
      pollingInterval: 30 # Optional. Default: 30 seconds
      cooldownPeriod: 3600
      # ^ Optional. Default: 300 seconds
      idleReplicaCount: 0 # Set to 0 to stop all the workers when
      # there is no activity on the queue
      minReplicaCount: 0
      maxReplicaCount: 1
      triggers:
      - type: rabbitmq
        authenticationRef:
          name: amqp-authentication-lister-save-bulk
        metadata:
          protocol: auto # Optional. Specifies protocol to use,
          # either amqp or http, or auto to
          # autodetect based on the `host` value.
          # Default value is auto.
          mode: QueueLength # QueueLength to trigger on number of msgs in queue
          excludeUnacknowledged: "false" # QueueLength should include unacked messages
          # Implies "http" protocol is used
          value: 1
          queueName: swh.lister.save_bulk.tasks.SaveBulkListerTask
          vhostName: / # Optional. If not specified, use the vhost in the
          # `host` connection string. Alternatively, you can
          # use existing environment variables to read
          # configuration from: See details in "Parameter
          # list" section hostFromEnv: RABBITMQ_HOST%
    # Source: swh/templates/loaders/keda-autoscaling.yaml
    apiVersion: keda.sh/v1alpha1
    kind: ScaledObject
    metadata:
      name: loader-save-bulk-operators
      namespace: swh-cassandra-next-version
    spec:
      scaleTargetRef:
        apiVersion: apps/v1 # Optional. Default: apps/v1
        kind: Deployment # Optional. Default: Deployment
        # Mandatory. Must be in same namespace as ScaledObject
        name: loader-save-bulk
        # envSourceContainerName: {container-name} # Optional. Default:
        # .spec.template.spec.containers[0]
      pollingInterval: 30 # Optional. Default: 30 seconds
      cooldownPeriod: 300
      # ^ Optional. Default: 300 seconds
      idleReplicaCount: 0 # Set to 0 to stop all the workers when
      # there is no activity on the queue
      minReplicaCount: 0
      maxReplicaCount: 1
      triggers:
      - type: rabbitmq
        authenticationRef:
          name: amqp-authentication-loader-save-bulk
        metadata:
          protocol: auto # Optional. Specifies protocol to use,
          # either amqp or http, or auto to
          # autodetect based on the `host` value.
          # Default value is auto.
          mode: QueueLength # QueueLength to trigger on number of msgs in queue
          excludeUnacknowledged: "false" # QueueLength should include unacked messages
          # Implies "http" protocol is used
          value: 10
          queueName: "save-bulk:swh.loader.bzr.tasks.LoadBazaar"
          vhostName: / # Optional. If not specified, use the vhost in the
          # `host` connection string. Alternatively, you can
          # use existing environment variables to read
          # configuration from: See details in "Parameter
          # list" section hostFromEnv: RABBITMQ_HOST%
      - type: rabbitmq
        authenticationRef:
          name: amqp-authentication-loader-save-bulk
        metadata:
          protocol: auto # Optional. Specifies protocol to use,
          # either amqp or http, or auto to
          # autodetect based on the `host` value.
          # Default value is auto.
          mode: QueueLength # QueueLength to trigger on number of msgs in queue
          excludeUnacknowledged: "false" # QueueLength should include unacked messages
          # Implies "http" protocol is used
          value: 10
          queueName: "save-bulk:swh.loader.cvs.tasks.LoadCvsRepository"
          vhostName: / # Optional. If not specified, use the vhost in the
          # `host` connection string. Alternatively, you can
          # use existing environment variables to read
          # configuration from: See details in "Parameter
          # list" section hostFromEnv: RABBITMQ_HOST%
      - type: rabbitmq
        authenticationRef:
          name: amqp-authentication-loader-save-bulk
        metadata:
          protocol: auto # Optional. Specifies protocol to use,
          # either amqp or http, or auto to
          # autodetect based on the `host` value.
          # Default value is auto.
          mode: QueueLength # QueueLength to trigger on number of msgs in queue
          excludeUnacknowledged: "false" # QueueLength should include unacked messages
          # Implies "http" protocol is used
          value: 10
          queueName: "save-bulk:swh.loader.git.tasks.UpdateGitRepository"
          vhostName: / # Optional. If not specified, use the vhost in the
          # `host` connection string. Alternatively, you can
          # use existing environment variables to read
          # configuration from: See details in "Parameter
          # list" section hostFromEnv: RABBITMQ_HOST%
      - type: rabbitmq
        authenticationRef:
          name: amqp-authentication-loader-save-bulk
        metadata:
          protocol: auto # Optional. Specifies protocol to use,
          # either amqp or http, or auto to
          # autodetect based on the `host` value.
          # Default value is auto.
          mode: QueueLength # QueueLength to trigger on number of msgs in queue
          excludeUnacknowledged: "false" # QueueLength should include unacked messages
          # Implies "http" protocol is used
          value: 10
          queueName: "save-bulk:swh.loader.git.tasks.LoadDiskGitRepository"
          vhostName: / # Optional. If not specified, use the vhost in the
          # `host` connection string. Alternatively, you can
          # use existing environment variables to read
          # configuration from: See details in "Parameter
          # list" section hostFromEnv: RABBITMQ_HOST%
      - type: rabbitmq
        authenticationRef:
          name: amqp-authentication-loader-save-bulk
        metadata:
          protocol: auto # Optional. Specifies protocol to use,
          # either amqp or http, or auto to
          # autodetect based on the `host` value.
          # Default value is auto.
          mode: QueueLength # QueueLength to trigger on number of msgs in queue
          excludeUnacknowledged: "false" # QueueLength should include unacked messages
          # Implies "http" protocol is used
          value: 10
          queueName: "save-bulk:swh.loader.git.tasks.UncompressAndLoadDiskGitRepository"
          vhostName: / # Optional. If not specified, use the vhost in the
          # `host` connection string. Alternatively, you can
          # use existing environment variables to read
          # configuration from: See details in "Parameter
          # list" section hostFromEnv: RABBITMQ_HOST%
      - type: rabbitmq
        authenticationRef:
          name: amqp-authentication-loader-save-bulk
        metadata:
          protocol: auto # Optional. Specifies protocol to use,
          # either amqp or http, or auto to
          # autodetect based on the `host` value.
          # Default value is auto.
          mode: QueueLength # QueueLength to trigger on number of msgs in queue
          excludeUnacknowledged: "false" # QueueLength should include unacked messages
          # Implies "http" protocol is used
          value: 10
          queueName: "save-bulk:swh.loader.mercurial.tasks.LoadArchiveMercurial"
          vhostName: / # Optional. If not specified, use the vhost in the
          # `host` connection string. Alternatively, you can
          # use existing environment variables to read
          # configuration from: See details in "Parameter
          # list" section hostFromEnv: RABBITMQ_HOST%
      - type: rabbitmq
        authenticationRef:
          name: amqp-authentication-loader-save-bulk
        metadata:
          protocol: auto # Optional. Specifies protocol to use,
          # either amqp or http, or auto to
          # autodetect based on the `host` value.
          # Default value is auto.
          mode: QueueLength # QueueLength to trigger on number of msgs in queue
          excludeUnacknowledged: "false" # QueueLength should include unacked messages
          # Implies "http" protocol is used
          value: 10
          queueName: "save-bulk:swh.loader.mercurial.tasks.LoadMercurial"
          vhostName: / # Optional. If not specified, use the vhost in the
          # `host` connection string. Alternatively, you can
          # use existing environment variables to read
          # configuration from: See details in "Parameter
          # list" section hostFromEnv: RABBITMQ_HOST%
      - type: rabbitmq
        authenticationRef:
          name: amqp-authentication-loader-save-bulk
        metadata:
          protocol: auto # Optional. Specifies protocol to use,
          # either amqp or http, or auto to
          # autodetect based on the `host` value.
          # Default value is auto.
          mode: QueueLength # QueueLength to trigger on number of msgs in queue
          excludeUnacknowledged: "false" # QueueLength should include unacked messages
          # Implies "http" protocol is used
          value: 10
          queueName: "save-bulk:swh.loader.svn.tasks.LoadSvnRepository"
          vhostName: / # Optional. If not specified, use the vhost in the
          # `host` connection string. Alternatively, you can
          # use existing environment variables to read
          # configuration from: See details in "Parameter
          # list" section hostFromEnv: RABBITMQ_HOST%
      - type: rabbitmq
        authenticationRef:
          name: amqp-authentication-loader-save-bulk
        metadata:
          protocol: auto # Optional. Specifies protocol to use,
          # either amqp or http, or auto to
          # autodetect based on the `host` value.
          # Default value is auto.
          mode: QueueLength # QueueLength to trigger on number of msgs in queue
          excludeUnacknowledged: "false" # QueueLength should include unacked messages
          # Implies "http" protocol is used
          value: 10
          queueName: "save-bulk:swh.loader.svn.tasks.MountAndLoadSvnRepository"
          vhostName: / # Optional. If not specified, use the vhost in the
          # `host` connection string. Alternatively, you can
          # use existing environment variables to read
          # configuration from: See details in "Parameter
          # list" section hostFromEnv: RABBITMQ_HOST%
      - type: rabbitmq
        authenticationRef:
          name: amqp-authentication-loader-save-bulk
        metadata:
          protocol: auto # Optional. Specifies protocol to use,
          # either amqp or http, or auto to
          # autodetect based on the `host` value.
          # Default value is auto.
          mode: QueueLength # QueueLength to trigger on number of msgs in queue
          excludeUnacknowledged: "false" # QueueLength should include unacked messages
          # Implies "http" protocol is used
          value: 10
          queueName: "save-bulk:swh.loader.svn.tasks.DumpMountAndLoadSvnRepository"
          vhostName: / # Optional. If not specified, use the vhost in the
          # `host` connection string. Alternatively, you can
          # use existing environment variables to read
          # configuration from: See details in "Parameter
          # list" section hostFromEnv: RABBITMQ_HOST%
      - type: rabbitmq
        authenticationRef:
          name: amqp-authentication-loader-save-bulk
        metadata:
          protocol: auto # Optional. Specifies protocol to use,
          # either amqp or http, or auto to
          # autodetect based on the `host` value.
          # Default value is auto.
          mode: QueueLength # QueueLength to trigger on number of msgs in queue
          excludeUnacknowledged: "false" # QueueLength should include unacked messages
          # Implies "http" protocol is used
          value: 10
          queueName: "save-bulk:swh.loader.package.archive.tasks.LoadTarball"
          vhostName: / # Optional. If not specified, use the vhost in the
          # `host` connection string. Alternatively, you can
          # use existing environment variables to read
          # configuration from: See details in "Parameter
          # list" section hostFromEnv: RABBITMQ_HOST%
    # Source: swh/templates/listers/keda-autoscaling.yaml
    apiVersion: keda.sh/v1alpha1
    kind: TriggerAuthentication
    metadata:
      name: amqp-authentication-lister-save-bulk
      namespace: swh-cassandra-next-version
    spec:
      secretTargetRef:
      - parameter: host # "host" is required by the scalerObject trigger metadata
        name: common-secrets
        key: rabbitmq-http-host
    # Source: swh/templates/loaders/keda-autoscaling.yaml
    apiVersion: keda.sh/v1alpha1
    kind: TriggerAuthentication
    metadata:
      name: amqp-authentication-loader-save-bulk
      namespace: swh-cassandra-next-version
    spec:
      secretTargetRef:
      - parameter: host # "host" is required by the scalerObject trigger metadata
        name: common-secrets
        key: rabbitmq-http-host
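
Reviewer note: a rough sketch of how the `loader-save-bulk-operators` ScaledObject above is likely to behave with `value: 10`, `idleReplicaCount: 0` and `maxReplicaCount: 1`. It assumes the usual autoscaler rule (ceiling of queue length over the target value, taking the largest result across triggers, then capping at the maximum). The function name and the queue lengths are hypothetical, and this is a simplification rather than KEDA's actual code.

```python
import math


def desired_replicas(queue_lengths, target=10, idle_replicas=0, max_replicas=1):
    """Approximate the replica count for a set of QueueLength triggers.

    queue_lengths maps a queue name to its current number of messages
    (hypothetical input; the real scaler reads this from RabbitMQ).
    """
    # No messages on any queue: fall back to the idle replica count.
    if all(n == 0 for n in queue_lengths.values()):
        return idle_replicas
    # Each trigger asks for ceil(length / target); the largest request wins.
    wanted = max(math.ceil(n / target) for n in queue_lengths.values())
    # At least one worker once active, never more than the configured maximum.
    return max(1, min(wanted, max_replicas))


if __name__ == "__main__":
    queues = {
        "save-bulk:swh.loader.git.tasks.UpdateGitRepository": 250,
        "save-bulk:swh.loader.svn.tasks.LoadSvnRepository": 0,
    }
    print(desired_replicas(queues))
```

Under these assumptions, any backlog on any of the save-bulk queues brings up a single worker, and an empty set of queues scales the deployment back to zero.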
    
  

data.config.yml.template  (v1/ConfigMap/swh-cassandra-next-version/web-cassandra-configuration-template)
  ± value change in multiline text (one insert, no deletions)
    + - swh.web.save_bulk


spec.template.metadata.annotations.checksum/config  (apps/v1/Deployment/swh-cassandra-next-version/web-cassandra)
  ± value change
    - 70e917c6847598408968b0584e57f21d420311c0a16dc8e0b6595126f1a96972
    + b72b07e9d007779f45f007dd81ca1668aac9578c64fbe59083d3089268435f83
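
Reviewer note: the `checksum/config` annotation changes because the rendered configuration changed. The chart template is not shown here, so the sketch below only assumes the common Helm pattern of hashing the rendered config with SHA-256 (the 64 hex characters above are consistent with that). The config text and the `swh.web.example_app` entry are hypothetical placeholders, not the real ConfigMap content.

```python
import hashlib


def config_checksum(rendered_config: str) -> str:
    """Return the SHA-256 hex digest of a rendered configuration text."""
    return hashlib.sha256(rendered_config.encode("utf-8")).hexdigest()


# Hypothetical app list before and after adding swh.web.save_bulk.
before = "apps:\n- swh.web.example_app\n"
after = before + "- swh.web.save_bulk\n"

if __name__ == "__main__":
    # A different digest changes the pod template annotation,
    # which makes the Deployment roll out new pods.
    print(config_checksum(before) != config_checksum(after))
```

A changed annotation on the pod template is what causes the `web-cassandra` Deployment to restart its pods with the new configuration.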



------------- diff for environment production namespace swh -------------

     _        __  __
   _| |_   _ / _|/ _|  between /tmp/swh-chart.swh.g4AQB5iO/production-swh.before, 152 documents
 / _' | | | | |_| |_       and /tmp/swh-chart.swh.g4AQB5iO/production-swh.after, 152 documents
| (_| | |_| |  _|  _|
 \__,_|\__, |_| |_|   returned no differences
        |___/



------------- diff for environment production namespace swh-cassandra -------------

     _        __  __
   _| |_   _ / _|/ _|  between /tmp/swh-chart.swh.g4AQB5iO/production-swh-cassandra.before, 473 documents
 / _' | | | | |_| |_       and /tmp/swh-chart.swh.g4AQB5iO/production-swh-cassandra.after, 473 documents
| (_| | |_| |  _|  _|
 \__,_|\__, |_| |_|   returned no differences
        |___/
Edited by Guillaume Samson
