swh/staging: Deploy bulk on-demand archival feature
Related to swh/infra/sysadm-environment#5407 (closed)
These modifications deploy the bulk on-demand archival feature:
- add swh.web.save_bulk to the Django apps;
- create a dedicated lister;
- create a dedicated loader;
- update the scheduler extra-services template;
- create a runner-first-visits scheduler.
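
For reviewers unfamiliar with the init-container-entrypoint.sh script shipped in the new lister and loader ConfigMaps, here is a rough Python sketch of what that script appears to do: expand environment variables such as ${AMQP_PASSWORD} in config.yml.template, then either splice an indented secret file in place of a placeholder line (for example __lister-credentials__) or blank the placeholder when the secret is not mounted. The names render_config and fragments, and the two-space indentation, are assumptions made for illustration only. The real script is bash and relies on eval, awk and sed, so it also expands things this sketch ignores (command substitution, for instance).

```python
from string import Template


def render_config(template: str, env: dict, fragments: dict) -> str:
    """Approximate the init container's config rendering (illustrative only).

    env:       values for ${VAR} references found in the template
    fragments: placeholder -> text to splice in, or None when the
               corresponding secret file is not mounted
    """
    # Step 1: expand ${VAR} references, leaving unknown ones untouched.
    rendered = Template(template).safe_substitute(env)

    out = []
    for line in rendered.splitlines():
        for placeholder, text in fragments.items():
            if placeholder in line:
                if text is None:
                    # Secret absent: strip the placeholder, keep the line.
                    out.append(line.replace(placeholder, ""))
                else:
                    # Secret present: replace the whole line with the
                    # secret's content, indented (width is an assumption).
                    out.extend("  " + part for part in text.splitlines())
                break
        else:
            # No placeholder on this line: copy it through unchanged.
            out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    tpl = (
        "celery:\n"
        "  task_broker: amqp://swhconsumer:${AMQP_PASSWORD}@host:5672/%2f\n"
        "credentials:\n"
        "  __lister-credentials__\n"
    )
    print(render_config(tpl, {"AMQP_PASSWORD": "example"},
                        {"__lister-credentials__": None}))
```

Because both secrets are declared optional in the deployments, the "secret absent" branch is the one to expect wherever the credentials are not provisioned.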
Helm-diff
[swh] Comparing changes between branches production and save_bulk_staging (per environment)...
Your branch is up to date with 'origin/production'.
[swh] Generate config in production branch for environment staging, namespace swh...
[swh] Generate config in production branch for environment staging, namespace swh-cassandra...
[swh] Generate config in production branch for environment staging, namespace swh-cassandra-next-version...
Your branch is up to date with 'origin/save_bulk_staging'.
[swh] Generate config in save_bulk_staging branch for environment staging...
[swh] Generate config in save_bulk_staging branch for environment staging...
[swh] Generate config in save_bulk_staging branch for environment staging...
Your branch is up to date with 'origin/production'.
[swh] Generate config in production branch for environment production, namespace swh...
[swh] Generate config in production branch for environment production, namespace swh-cassandra...
[swh] Generate config in production branch for environment production, namespace swh-cassandra-next-version...
Your branch is up to date with 'origin/save_bulk_staging'.
[swh] Generate config in save_bulk_staging branch for environment production...
[swh] Generate config in save_bulk_staging branch for environment production...
[swh] Generate config in save_bulk_staging branch for environment production...
------------- diff for environment staging namespace swh -------------
dyff between /tmp/swh-chart.swh.g4AQB5iO/staging-swh.before, 141 documents
and /tmp/swh-chart.swh.g4AQB5iO/staging-swh.after, 141 documents
returned two differences
data.config.yml.template (v1/ConfigMap/swh/web-postgresql-configuration-template)
± value change in multiline text (one insert, no deletions)
+ - swh.web.save_bulk
spec.template.metadata.annotations.checksum/config (apps/v1/Deployment/swh/web-postgresql)
± value change
- c09953473b423227b00456b299ab769ec349d9faea4795a45ed6ecd9aaadb825
+ fbe71ea8a1ec70ec4445537dcb8f2a0325fbd6aec65a0a30f6209090ce4cafc0
------------- diff for environment staging namespace swh-cassandra -------------
dyff between /tmp/swh-chart.swh.g4AQB5iO/staging-swh-cassandra.before, 451 documents
and /tmp/swh-chart.swh.g4AQB5iO/staging-swh-cassandra.after, 460 documents
returned five differences
(file level)
---
# Source: swh/templates/listers/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: lister-save-bulk-template
namespace: swh-cassandra
data:
config.yml.template: |
storage:
cls: pipeline
steps:
- cls: retry
- cls: remote
url: http://storage-cassandra-read-only-ingress
scheduler:
cls: remote
url: http://scheduler.internal.staging.swh.network
celery:
task_broker: amqp://swhconsumer:${AMQP_PASSWORD}@scheduler0.internal.staging.swh.network:5672/%2f
task_acks_late: true
task_queues:
- swh.lister.save_bulk.tasks.SaveBulkListerTask
sentry_settings_for_celery_tasks:
__sentry-settings-for-celery-tasks__
credentials:
__lister-credentials__
init-container-entrypoint.sh: |
#!/bin/bash
set -e
CONFIG_FILE=/etc/swh/config.yml
CONFIG_FILE_WIP=/tmp/wip-config.yml
# substitute environment variables when creating the default config.yml
eval echo \""$(</etc/swh/configuration-template/config.yml.template)"\" \
> $CONFIG_FILE
SENTRY_SETTINGS_PATH=/etc/credentials/sentry-settings/sentry_settings_for_celery_tasks
if [ -f $SENTRY_SETTINGS_PATH ]; then
awk "/__sentry-settings-for-celery-tasks__/{system(\"sed 's/^/ /g' $SENTRY_SETTINGS_PATH\");next}1" $CONFIG_FILE > $CONFIG_FILE_WIP
mv $CONFIG_FILE_WIP $CONFIG_FILE
else
sed -i 's/__sentry-settings-for-celery-tasks__//g' $CONFIG_FILE
fi
CREDS_LISTER_PATH=/etc/credentials/listers/credentials
if [ -f $CREDS_LISTER_PATH ]; then
awk "/__lister-credentials__/{system(\"sed 's/^/ /g' $CREDS_LISTER_PATH\");next}1" $CONFIG_FILE > $CONFIG_FILE_WIP
mv $CONFIG_FILE_WIP $CONFIG_FILE
else
sed -i 's/__lister-credentials__//g' $CONFIG_FILE
fi
exit 0
logging-configuration.yml: |
version: 1
handlers:
console:
class: logging.StreamHandler
formatter: json
stream: ext://sys.stdout
formatters:
json:
class: pythonjsonlogger.jsonlogger.JsonFormatter
# python-json-logger parses the format argument to get the variables it actually expands into the json
format: "%(asctime)s:%(threadName)s:%(pathname)s:%(lineno)s:%(funcName)s:%(task_name)s:%(task_id)s:%(name)s:%(levelname)s:%(message)s"
loggers:
celery:
level: "INFO"
amqp:
level: WARNING
urllib3:
level: WARNING
azure.core.pipeline.policies.http_logging_policy:
level: WARNING
swh:
level: "INFO"
celery.task:
level: "INFO"
root:
level: "INFO"
handlers:
- console
# Source: swh/templates/loaders/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: loader-save-bulk-template
namespace: swh-cassandra
data:
config.yml.template: |
storage:
cls: pipeline
steps:
- cls: buffer
min_batch_size:
content: 100
content_bytes: 52428800
directory: 100
directory_entries: 500
extid: 100
release: 100
release_bytes: 52428800
revision: 100
revision_bytes: 52428800
revision_parents: 200
- cls: filter
- cls: retry
- cls: remote
url: http://storage-cassandra-read-write-ingress
celery:
task_broker: amqp://swhconsumer:${AMQP_PASSWORD}@scheduler0.internal.staging.swh.network:5672/%2f
task_acks_late: true
task_queues:
- save-bulk:swh.loader.bzr.tasks.LoadBazaar
- save-bulk:swh.loader.cvs.tasks.LoadCvsRepository
- save-bulk:swh.loader.git.tasks.UpdateGitRepository
- save-bulk:swh.loader.git.tasks.LoadDiskGitRepository
- save-bulk:swh.loader.git.tasks.UncompressAndLoadDiskGitRepository
- save-bulk:swh.loader.mercurial.tasks.LoadArchiveMercurial
- save-bulk:swh.loader.mercurial.tasks.LoadMercurial
- save-bulk:swh.loader.svn.tasks.LoadSvnRepository
- save-bulk:swh.loader.svn.tasks.MountAndLoadSvnRepository
- save-bulk:swh.loader.svn.tasks.DumpMountAndLoadSvnRepository
- save-bulk:swh.loader.package.archive.tasks.LoadTarball
sentry_settings_for_celery_tasks:
__sentry-settings-for-celery-tasks__
metadata_fetcher_credentials:
__metadata-fetcher-credentials__
init-container-entrypoint.sh: |
#!/bin/bash
set -e
CONFIG_FILE=/etc/swh/config.yml
CONFIG_FILE_WIP=/tmp/wip-config.yml
# substitute environment variables when creating the default config.yml
eval echo \""$(</etc/swh/configuration-template/config.yml.template)"\" \
> $CONFIG_FILE
SENTRY_SETTINGS_PATH=/etc/credentials/sentry-settings/sentry_settings_for_celery_tasks
if [ -f $SENTRY_SETTINGS_PATH ]; then
awk "/__sentry-settings-for-celery-tasks__/{system(\"sed 's/^/ /g' $SENTRY_SETTINGS_PATH\");next}1" $CONFIG_FILE > $CONFIG_FILE_WIP
mv $CONFIG_FILE_WIP $CONFIG_FILE
else
sed -i 's/__sentry-settings-for-celery-tasks__//g' $CONFIG_FILE
fi
CREDS_LISTER_PATH=/etc/credentials/metadata-fetcher/credentials
if [ -f $CREDS_LISTER_PATH ]; then
awk "/__metadata-fetcher-credentials__/{system(\"sed 's/^/ /g' $CREDS_LISTER_PATH\");next}1" $CONFIG_FILE > $CONFIG_FILE_WIP
mv $CONFIG_FILE_WIP $CONFIG_FILE
else
sed -i 's/__metadata-fetcher-credentials__//g' $CONFIG_FILE
fi
exit 0
logging-configuration.yml: |
version: 1
handlers:
console:
class: logging.StreamHandler
formatter: json
stream: ext://sys.stdout
formatters:
json:
class: pythonjsonlogger.jsonlogger.JsonFormatter
# python-json-logger parses the format argument to get the variables it actually expands into the json
format: "%(asctime)s:%(threadName)s:%(pathname)s:%(lineno)s:%(funcName)s:%(task_name)s:%(task_id)s:%(name)s:%(levelname)s:%(message)s"
loggers:
celery:
level: "INFO"
amqp:
level: WARNING
urllib3:
level: WARNING
azure.core.pipeline.policies.http_logging_policy:
level: WARNING
swh:
level: "INFO"
celery.task:
level: "INFO"
root:
level: "INFO"
handlers:
- console
# Source: swh/templates/listers/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: lister-save-bulk
namespace: swh-cassandra
labels:
app: lister-save-bulk
spec:
revisionHistoryLimit: 2
selector:
matchLabels:
app: lister-save-bulk
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
template:
metadata:
labels:
app: lister-save-bulk
annotations:
# Force a rollout upgrade if the configuration changes
checksum/config: 7112d1084fbc15ba336705ada811864a97de3f3a2119da390524ef4046c34931
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: swh/lister
operator: In
values:
- "true"
priorityClassName: swh-cassandra-normal-workload
terminationGracePeriodSeconds: 3600
initContainers:
- name: prepare-configuration
image: "debian:bullseye"
imagePullPolicy: IfNotPresent
env:
- name: AMQP_PASSWORD
valueFrom:
secretKeyRef:
key: swhconsumer-password
name: amqp-secrets
optional: false
command:
- /entrypoint.sh
volumeMounts:
- name: configuration-template
mountPath: /entrypoint.sh
subPath: init-container-entrypoint.sh
readOnly: true
- name: configuration
mountPath: /etc/swh
- name: configuration-template
mountPath: /etc/swh/configuration-template
- name: lister-credentials-secrets
mountPath: /etc/credentials/listers
readOnly: true
- name: sentry-settings-for-celery-tasks
mountPath: /etc/credentials/sentry-settings
readOnly: true
containers:
- name: listers
resources:
requests:
memory: 256Mi
cpu: 250m
image: "container-registry.softwareheritage.org/swh/infra/swh-apps/lister:20241014.2"
imagePullPolicy: IfNotPresent
command:
- /bin/bash
args:
- "-c"
- /opt/swh/entrypoint.sh
lifecycle:
preStop:
exec:
command:
- /pre-stop.sh
env:
- name: STATSD_HOST
value: prometheus-statsd-exporter
- name: STATSD_PORT
value: 9125
- name: STATSD_TAGS
value: "deployment:lister-save-bulk"
- name: MAX_TASKS_PER_CHILD
value: 1
- name: SWH_LOG_LEVEL
value: INFO
- name: SWH_CONFIG_FILENAME
value: /etc/swh/config.yml
- name: SWH_LOG_CONFIG
value: /etc/swh/logging-configuration.yml
- name: SWH_SENTRY_ENVIRONMENT
value: staging
- name: SWH_SENTRY_DISABLE_LOGGING_EVENTS
value: yes
volumeMounts:
- name: lister-utils
mountPath: /pre-stop.sh
subPath: pre-stop.sh
- name: configuration
mountPath: /etc/swh
- name: configuration-template
mountPath: /etc/swh/logging-configuration.yml
subPath: logging-configuration.yml
readOnly: true
volumes:
- name: configuration
ephemeral:
volumeClaimTemplate:
metadata:
labels:
type: ephemeral-volume
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 100Gi
storageClassName: local-path
- name: configuration-template
configMap:
name: lister-save-bulk-template
defaultMode: 0777
items:
- key: config.yml.template
path: config.yml.template
- key: init-container-entrypoint.sh
path: init-container-entrypoint.sh
- key: logging-configuration.yml
path: logging-configuration.yml
- name: lister-utils
configMap:
name: lister-utils
defaultMode: 0777
items:
- key: pre-stop-idempotent.sh
path: pre-stop.sh
- name: lister-credentials-secrets
secret:
secretName: lister-credentials-secrets
optional: true
- name: sentry-settings-for-celery-tasks
secret:
secretName: sentry-settings-for-celery-tasks
optional: true
# Source: swh/templates/loaders/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: loader-save-bulk
namespace: swh-cassandra
labels:
app: loader-save-bulk
spec:
revisionHistoryLimit: 2
selector:
matchLabels:
app: loader-save-bulk
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
template:
metadata:
labels:
app: loader-save-bulk
annotations:
# Force a rollout upgrade if the configuration changes
checksum/config: 99e0abb538a030866ff7ad7209bac7c693de9e96ac792112ab1af36f1b60fd46
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: swh/loader
operator: In
values:
- "true"
priorityClassName: swh-cassandra-normal-workload
terminationGracePeriodSeconds: 3600
dnsConfig:
options:
- name: ndots
value: 1
searches:
- cluster.local
- svc.cluster.local
- swh-cassandra.svc.cluster.local
initContainers:
- name: prepare-configuration
image: "debian:bullseye"
imagePullPolicy: IfNotPresent
env:
- name: AMQP_PASSWORD
valueFrom:
secretKeyRef:
key: swhconsumer-password
name: amqp-secrets
optional: false
command:
- /entrypoint.sh
volumeMounts:
- name: configuration-template
mountPath: /entrypoint.sh
subPath: init-container-entrypoint.sh
readOnly: true
- name: configuration
mountPath: /etc/swh
- name: configuration-template
mountPath: /etc/swh/configuration-template
- name: metadata-fetcher-credentials
mountPath: /etc/credentials/metadata-fetcher
readOnly: true
- name: sentry-settings-for-celery-tasks
mountPath: /etc/credentials/sentry-settings
readOnly: true
containers:
- name: loaders
image: "container-registry.softwareheritage.org/swh/infra/swh-apps/loader_savecodenow:20241014.1"
imagePullPolicy: IfNotPresent
command:
- /opt/swh/entrypoint.sh
resources:
requests:
memory: 200Mi
cpu: 50m
lifecycle:
preStop:
exec:
command:
- /pre-stop.sh
env:
- name: STATSD_HOST
value: prometheus-statsd-exporter
- name: STATSD_PORT
value: 9125
- name: STATSD_TAGS
value: "deployment:loader-save-bulk"
- name: MAX_TASKS_PER_CHILD
value: 10
- name: SWH_LOG_LEVEL
value: INFO
- name: SWH_CONFIG_FILENAME
value: /etc/swh/config.yml
- name: SWH_LOG_CONFIG
value: /etc/swh/logging-configuration.yml
- name: SWH_SENTRY_ENVIRONMENT
value: staging
- name: SWH_SENTRY_DISABLE_LOGGING_EVENTS
value: yes
volumeMounts:
- name: loader-utils
mountPath: /pre-stop.sh
subPath: pre-stop.sh
- name: configuration
mountPath: /etc/swh
- name: localstorage
mountPath: /tmp
- name: configuration-template
mountPath: /etc/swh/logging-configuration.yml
subPath: logging-configuration.yml
readOnly: true
volumes:
- name: localstorage
ephemeral:
volumeClaimTemplate:
metadata:
labels:
type: ephemeral-volume
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 100Gi
storageClassName: local-path
- name: configuration
emptyDir: {}
- name: configuration-template
configMap:
name: loader-save-bulk-template
defaultMode: 0777
items:
- key: config.yml.template
path: config.yml.template
- key: init-container-entrypoint.sh
path: init-container-entrypoint.sh
- key: logging-configuration.yml
path: logging-configuration.yml
- name: loader-utils
configMap:
name: loader-utils
defaultMode: 0777
items:
- key: pre-stop-idempotent.sh
path: pre-stop.sh
- name: metadata-fetcher-credentials
secret:
secretName: metadata-fetcher-credentials
optional: true
- name: sentry-settings-for-celery-tasks
secret:
secretName: sentry-settings-for-celery-tasks
optional: true
# Source: swh/templates/scheduler/extra-services-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
namespace: swh-cassandra
name: scheduler-runner-first-visits
labels:
app: scheduler-runner-first-visits
spec:
revisionHistoryLimit: 2
replicas: 1
selector:
matchLabels:
app: scheduler-runner-first-visits
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
template:
metadata:
labels:
app: scheduler-runner-first-visits
annotations:
checksum/config: 4c7048bda6a4e2c34e0e15f9598c1344a57a2a6eb0a3b8c844ce0e70742bce0b
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: swh/scheduler
operator: In
values:
- "true"
priorityClassName: swh-cassandra-frontend-rpc-workload
initContainers:
- name: prepare-configuration
image: "debian:bullseye"
imagePullPolicy: IfNotPresent
command:
- /bin/bash
args:
- "-c"
- "eval echo "\"$(</etc/swh/configuration-template/config.yml.template)\"" > /etc/swh/config.yml"
env:
- name: AMQP_PASSWORD
valueFrom:
secretKeyRef:
key: swhproducer-password
name: amqp-secrets
optional: false
volumeMounts:
- name: configuration
mountPath: /etc/swh
- name: configuration-template
mountPath: /etc/swh/configuration-template
containers:
- name: scheduler-runner-first-visits
resources:
requests:
memory: 100Mi
cpu: 10m
image: "container-registry.softwareheritage.org/swh/infra/swh-apps/scheduler:20241014.1"
command:
- /opt/swh/entrypoint.sh
args:
- swh
- scheduler
- "--config-file"
- /etc/swh/config.yml
- start-runner-first-visits
- "--period"
- 10
env:
- name: STATSD_HOST
value: prometheus-statsd-exporter
- name: STATSD_PORT
value: 9125
- name: STATSD_TAGS
value: "deployment:scheduler-runner-first-visits"
- name: SWH_CONFIG_FILENAME
value: /etc/swh/config.yml
- name: SWH_LOG_LEVEL
value: INFO
- name: SWH_SENTRY_ENVIRONMENT
value: staging
- name: SWH_MAIN_PACKAGE
value: swh.scheduler
- name: SWH_SENTRY_DSN
valueFrom:
secretKeyRef:
name: scheduler-sentry-secrets
key: sentry-dsn
# if the setting doesn't exist, sentry issue pushes will be disabled
optional: false
- name: SWH_SENTRY_DISABLE_LOGGING_EVENTS
value: "true"
imagePullPolicy: IfNotPresent
volumeMounts:
- name: configuration
mountPath: /etc/swh
volumes:
- name: configuration
emptyDir: {}
- name: configuration-template
configMap:
name: extra-services-configuration-template
items:
- key: config.yml.template
path: config.yml.template
# Source: swh/templates/listers/keda-autoscaling.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: lister-save-bulk-operators
namespace: swh-cassandra
spec:
scaleTargetRef:
apiVersion: apps/v1 # Optional. Default: apps/v1
kind: Deployment # Optional. Default: Deployment
# Mandatory. Must be in same namespace as ScaledObject
name: lister-save-bulk
# envSourceContainerName: {container-name} # Optional. Default:
# .spec.template.spec.containers[0]
pollingInterval: 30 # Optional. Default: 30 seconds
cooldownPeriod: 3600
# ^ Optional. Default: 300 seconds
idleReplicaCount: 0 # Set to 0 to stop all the workers when
# there is no activity on the queue
minReplicaCount: 0
maxReplicaCount: 1
triggers:
- type: rabbitmq
authenticationRef:
name: amqp-authentication-lister-save-bulk
metadata:
protocol: auto # Optional. Specifies protocol to use,
# either amqp or http, or auto to
# autodetect based on the `host` value.
# Default value is auto.
mode: QueueLength # QueueLength to trigger on number of msgs in queue
excludeUnacknowledged: "false" # QueueLength should include unacked messages
# Implies "http" protocol is used
value: 1
queueName: swh.lister.save_bulk.tasks.SaveBulkListerTask
vhostName: / # Optional. If not specified, use the vhost in the
# `host` connection string. Alternatively, you can
# use existing environment variables to read
# configuration from: See details in "Parameter
# list" section hostFromEnv: RABBITMQ_HOST%
# Source: swh/templates/loaders/keda-autoscaling.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: loader-save-bulk-operators
namespace: swh-cassandra
spec:
scaleTargetRef:
apiVersion: apps/v1 # Optional. Default: apps/v1
kind: Deployment # Optional. Default: Deployment
# Mandatory. Must be in same namespace as ScaledObject
name: loader-save-bulk
# envSourceContainerName: {container-name} # Optional. Default:
# .spec.template.spec.containers[0]
pollingInterval: 30 # Optional. Default: 30 seconds
cooldownPeriod: 300
# ^ Optional. Default: 300 seconds
idleReplicaCount: 0 # Set to 0 to stop all the workers when
# there is no activity on the queue
minReplicaCount: 0
maxReplicaCount: 1
triggers:
- type: rabbitmq
authenticationRef:
name: amqp-authentication-loader-save-bulk
metadata:
protocol: auto # Optional. Specifies protocol to use,
# either amqp or http, or auto to
# autodetect based on the `host` value.
# Default value is auto.
mode: QueueLength # QueueLength to trigger on number of msgs in queue
excludeUnacknowledged: "false" # QueueLength should include unacked messages
# Implies "http" protocol is used
value: 10
queueName: "save-bulk:swh.loader.bzr.tasks.LoadBazaar"
vhostName: / # Optional. If not specified, use the vhost in the
# `host` connection string. Alternatively, you can
# use existing environment variables to read
# configuration from: See details in "Parameter
# list" section hostFromEnv: RABBITMQ_HOST%
- type: rabbitmq
authenticationRef:
name: amqp-authentication-loader-save-bulk
metadata:
protocol: auto # Optional. Specifies protocol to use,
# either amqp or http, or auto to
# autodetect based on the `host` value.
# Default value is auto.
mode: QueueLength # QueueLength to trigger on number of msgs in queue
excludeUnacknowledged: "false" # QueueLength should include unacked messages
# Implies "http" protocol is used
value: 10
queueName: "save-bulk:swh.loader.cvs.tasks.LoadCvsRepository"
vhostName: / # Optional. If not specified, use the vhost in the
# `host` connection string. Alternatively, you can
# use existing environment variables to read
# configuration from: See details in "Parameter
# list" section hostFromEnv: RABBITMQ_HOST%
- type: rabbitmq
authenticationRef:
name: amqp-authentication-loader-save-bulk
metadata:
protocol: auto # Optional. Specifies protocol to use,
# either amqp or http, or auto to
# autodetect based on the `host` value.
# Default value is auto.
mode: QueueLength # QueueLength to trigger on number of msgs in queue
excludeUnacknowledged: "false" # QueueLength should include unacked messages
# Implies "http" protocol is used
value: 10
queueName: "save-bulk:swh.loader.git.tasks.UpdateGitRepository"
vhostName: / # Optional. If not specified, use the vhost in the
# `host` connection string. Alternatively, you can
# use existing environment variables to read
# configuration from: See details in "Parameter
# list" section hostFromEnv: RABBITMQ_HOST%
- type: rabbitmq
authenticationRef:
name: amqp-authentication-loader-save-bulk
metadata:
protocol: auto # Optional. Specifies protocol to use,
# either amqp or http, or auto to
# autodetect based on the `host` value.
# Default value is auto.
mode: QueueLength # QueueLength to trigger on number of msgs in queue
excludeUnacknowledged: "false" # QueueLength should include unacked messages
# Implies "http" protocol is used
value: 10
queueName: "save-bulk:swh.loader.git.tasks.LoadDiskGitRepository"
vhostName: / # Optional. If not specified, use the vhost in the
# `host` connection string. Alternatively, you can
# use existing environment variables to read
# configuration from: See details in "Parameter
# list" section hostFromEnv: RABBITMQ_HOST%
- type: rabbitmq
authenticationRef:
name: amqp-authentication-loader-save-bulk
metadata:
protocol: auto # Optional. Specifies protocol to use,
# either amqp or http, or auto to
# autodetect based on the `host` value.
# Default value is auto.
mode: QueueLength # QueueLength to trigger on number of msgs in queue
excludeUnacknowledged: "false" # QueueLength should include unacked messages
# Implies "http" protocol is used
value: 10
queueName: "save-bulk:swh.loader.git.tasks.UncompressAndLoadDiskGitRepository"
vhostName: / # Optional. If not specified, use the vhost in the
# `host` connection string. Alternatively, you can
# use existing environment variables to read
# configuration from: See details in "Parameter
# list" section hostFromEnv: RABBITMQ_HOST%
- type: rabbitmq
authenticationRef:
name: amqp-authentication-loader-save-bulk
metadata:
protocol: auto # Optional. Specifies protocol to use,
# either amqp or http, or auto to
# autodetect based on the `host` value.
# Default value is auto.
mode: QueueLength # QueueLength to trigger on number of msgs in queue
excludeUnacknowledged: "false" # QueueLength should include unacked messages
# Implies "http" protocol is used
value: 10
queueName: "save-bulk:swh.loader.mercurial.tasks.LoadArchiveMercurial"
vhostName: / # Optional. If not specified, use the vhost in the
# `host` connection string. Alternatively, you can
# use existing environment variables to read
# configuration from: See details in "Parameter
# list" section hostFromEnv: RABBITMQ_HOST%
- type: rabbitmq
authenticationRef:
name: amqp-authentication-loader-save-bulk
metadata:
protocol: auto # Optional. Specifies protocol to use,
# either amqp or http, or auto to
# autodetect based on the `host` value.
# Default value is auto.
mode: QueueLength # QueueLength to trigger on number of msgs in queue
excludeUnacknowledged: "false" # QueueLength should include unacked messages
# Implies "http" protocol is used
value: 10
queueName: "save-bulk:swh.loader.mercurial.tasks.LoadMercurial"
vhostName: / # Optional. If not specified, use the vhost in the
# `host` connection string. Alternatively, you can
# use existing environment variables to read
# configuration from: See details in "Parameter
# list" section hostFromEnv: RABBITMQ_HOST%
- type: rabbitmq
authenticationRef:
name: amqp-authentication-loader-save-bulk
metadata:
protocol: auto # Optional. Specifies protocol to use,
# either amqp or http, or auto to
# autodetect based on the `host` value.
# Default value is auto.
mode: QueueLength # QueueLength to trigger on number of msgs in queue
excludeUnacknowledged: "false" # QueueLength should include unacked messages
# Implies "http" protocol is used
value: 10
queueName: "save-bulk:swh.loader.svn.tasks.LoadSvnRepository"
vhostName: / # Optional. If not specified, use the vhost in the
# `host` connection string. Alternatively, you can
# use existing environment variables to read
# configuration from: See details in "Parameter
# list" section hostFromEnv: RABBITMQ_HOST%
- type: rabbitmq
authenticationRef:
name: amqp-authentication-loader-save-bulk
metadata:
protocol: auto # Optional. Specifies protocol to use,
# either amqp or http, or auto to
# autodetect based on the `host` value.
# Default value is auto.
mode: QueueLength # QueueLength to trigger on number of msgs in queue
excludeUnacknowledged: "false" # QueueLength should include unacked messages
# Implies "http" protocol is used
value: 10
queueName: "save-bulk:swh.loader.svn.tasks.MountAndLoadSvnRepository"
vhostName: / # Optional. If not specified, use the vhost in the
# `host` connection string. Alternatively, you can
# use existing environment variables to read
# configuration from: See details in "Parameter
# list" section hostFromEnv: RABBITMQ_HOST%
- type: rabbitmq
authenticationRef:
name: amqp-authentication-loader-save-bulk
metadata:
protocol: auto # Optional. Specifies protocol to use,
# either amqp or http, or auto to
# autodetect based on the `host` value.
# Default value is auto.
mode: QueueLength # QueueLength to trigger on number of msgs in queue
excludeUnacknowledged: "false" # QueueLength should include unacked messages
# Implies "http" protocol is used
value: 10
queueName: "save-bulk:swh.loader.svn.tasks.DumpMountAndLoadSvnRepository"
vhostName: / # Optional. If not specified, use the vhost in the
# `host` connection string. Alternatively, you can
# use existing environment variables to read
# configuration from: See details in "Parameter
# list" section hostFromEnv: RABBITMQ_HOST%
- type: rabbitmq
authenticationRef:
name: amqp-authentication-loader-save-bulk
metadata:
protocol: auto # Optional. Specifies protocol to use,
# either amqp or http, or auto to
# autodetect based on the `host` value.
# Default value is auto.
mode: QueueLength # QueueLength to trigger on number of msgs in queue
excludeUnacknowledged: "false" # QueueLength should include unacked messages
# Implies "http" protocol is used
value: 10
queueName: "save-bulk:swh.loader.package.archive.tasks.LoadTarball"
vhostName: / # Optional. If not specified, use the vhost in the
# `host` connection string. Alternatively, you can
# use existing environment variables to read
# configuration from: See details in "Parameter
# list" section hostFromEnv: RABBITMQ_HOST%
# Source: swh/templates/listers/keda-autoscaling.yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
name: amqp-authentication-lister-save-bulk
namespace: swh-cassandra
spec:
secretTargetRef:
- parameter: host # "host" is required by the scalerObject trigger metadata
name: common-secrets
key: rabbitmq-http-host
# Source: swh/templates/loaders/keda-autoscaling.yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
name: amqp-authentication-loader-save-bulk
namespace: swh-cassandra
spec:
secretTargetRef:
- parameter: host # "host" is required by the scalerObject trigger metadata
name: common-secrets
key: rabbitmq-http-host
data.config.yml.template (v1/ConfigMap/swh-cassandra/web-cassandra-configuration-template)
± value change in multiline text (one insert, no deletions)
+ - swh.web.save_bulk
data.config.yml.template (v1/ConfigMap/swh-cassandra/web-webhooks-configuration-template)
± value change in multiline text (one insert, no deletions)
+ - swh.web.save_bulk
spec.template.metadata.annotations.checksum/config (apps/v1/Deployment/swh-cassandra/web-cassandra)
± value change
- 63f91d0d954ad4733deadfe303d8a447a3be38dcf31d5c035b6d30ddccc42934
+ 256e0e2a2a2069b1c01681de89ea1778456dac1cbdb757bd4daa2cdba1f78a12
spec.template.metadata.annotations.checksum/config (apps/v1/Deployment/swh-cassandra/web-webhooks)
± value change
- 273830564690f1c2225558337f98e73c7003c012612c807bc1a7d32d9340e752
+ 6611b6d5b581451b710773f38a8562ae154b340589d6d871b478ea131091a23a
------------- diff for environment staging namespace swh-cassandra-next-version -------------
dyff between /tmp/swh-chart.swh.g4AQB5iO/staging-swh-cassandra-next-version.before, 360 documents
and /tmp/swh-chart.swh.g4AQB5iO/staging-swh-cassandra-next-version.after, 369 documents
returned three differences
(file level)
---
# Source: swh/templates/listers/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: lister-save-bulk-template
namespace: swh-cassandra-next-version
data:
config.yml.template: |
storage:
cls: pipeline
steps:
- cls: retry
- cls: remote
url: http://storage-ro-postgresql:5002
scheduler:
cls: remote
url: http://scheduler-rpc:5008
celery:
task_broker: amqp://${AMQP_USERNAME}:${AMQP_PASSWORD}@rabbitmq-scheduler:5672/%2f
task_acks_late: true
task_queues:
- swh.lister.save_bulk.tasks.SaveBulkListerTask
sentry_settings_for_celery_tasks:
__sentry-settings-for-celery-tasks__
credentials:
__lister-credentials__
init-container-entrypoint.sh: |
#!/bin/bash
set -e
CONFIG_FILE=/etc/swh/config.yml
CONFIG_FILE_WIP=/tmp/wip-config.yml
# substitute environment variables when creating the default config.yml
eval echo \""$(</etc/swh/configuration-template/config.yml.template)"\" \
> $CONFIG_FILE
SENTRY_SETTINGS_PATH=/etc/credentials/sentry-settings/sentry_settings_for_celery_tasks
if [ -f $SENTRY_SETTINGS_PATH ]; then
awk "/__sentry-settings-for-celery-tasks__/{system(\"sed 's/^/ /g' $SENTRY_SETTINGS_PATH\");next}1" $CONFIG_FILE > $CONFIG_FILE_WIP
mv $CONFIG_FILE_WIP $CONFIG_FILE
else
sed -i 's/__sentry-settings-for-celery-tasks__//g' $CONFIG_FILE
fi
CREDS_LISTER_PATH=/etc/credentials/listers/credentials
if [ -f $CREDS_LISTER_PATH ]; then
awk "/__lister-credentials__/{system(\"sed 's/^/ /g' $CREDS_LISTER_PATH\");next}1" $CONFIG_FILE > $CONFIG_FILE_WIP
mv $CONFIG_FILE_WIP $CONFIG_FILE
else
sed -i 's/__lister-credentials__//g' $CONFIG_FILE
fi
exit 0
logging-configuration.yml: |
version: 1
handlers:
console:
class: logging.StreamHandler
formatter: json
stream: ext://sys.stdout
formatters:
json:
class: pythonjsonlogger.jsonlogger.JsonFormatter
# python-json-logger parses the format argument to get the variables it actually expands into the json
format: "%(asctime)s:%(threadName)s:%(pathname)s:%(lineno)s:%(funcName)s:%(task_name)s:%(task_id)s:%(name)s:%(levelname)s:%(message)s"
loggers:
celery:
level: "INFO"
amqp:
level: WARNING
urllib3:
level: WARNING
azure.core.pipeline.policies.http_logging_policy:
level: WARNING
swh:
level: "INFO"
celery.task:
level: "INFO"
root:
level: "INFO"
handlers:
- console
# Source: swh/templates/loaders/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: loader-save-bulk-template
namespace: swh-cassandra-next-version
data:
config.yml.template: |
storage:
cls: pipeline
steps:
- cls: buffer
min_batch_size:
content: 100
content_bytes: 52428800
directory: 100
directory_entries: 500
extid: 100
release: 100
release_bytes: 52428800
revision: 100
revision_bytes: 52428800
revision_parents: 200
- cls: filter
- cls: retry
- cls: remote
url: http://storage-rw-cassandra:5002
celery:
task_broker: amqp://${AMQP_USERNAME}:${AMQP_PASSWORD}@rabbitmq-scheduler:5672/%2f
task_acks_late: true
task_queues:
- save-bulk:swh.loader.bzr.tasks.LoadBazaar
- save-bulk:swh.loader.cvs.tasks.LoadCvsRepository
- save-bulk:swh.loader.git.tasks.UpdateGitRepository
- save-bulk:swh.loader.git.tasks.LoadDiskGitRepository
- save-bulk:swh.loader.git.tasks.UncompressAndLoadDiskGitRepository
- save-bulk:swh.loader.mercurial.tasks.LoadArchiveMercurial
- save-bulk:swh.loader.mercurial.tasks.LoadMercurial
- save-bulk:swh.loader.svn.tasks.LoadSvnRepository
- save-bulk:swh.loader.svn.tasks.MountAndLoadSvnRepository
- save-bulk:swh.loader.svn.tasks.DumpMountAndLoadSvnRepository
- save-bulk:swh.loader.package.archive.tasks.LoadTarball
sentry_settings_for_celery_tasks:
__sentry-settings-for-celery-tasks__
metadata_fetcher_credentials:
__metadata-fetcher-credentials__
init-container-entrypoint.sh: |
#!/bin/bash
set -e
CONFIG_FILE=/etc/swh/config.yml
CONFIG_FILE_WIP=/tmp/wip-config.yml
# substitute environment variables when creating the default config.yml
eval echo \""$(</etc/swh/configuration-template/config.yml.template)"\" \
> $CONFIG_FILE
SENTRY_SETTINGS_PATH=/etc/credentials/sentry-settings/sentry_settings_for_celery_tasks
if [ -f $SENTRY_SETTINGS_PATH ]; then
awk "/__sentry-settings-for-celery-tasks__/{system(\"sed 's/^/ /g' $SENTRY_SETTINGS_PATH\");next}1" $CONFIG_FILE > $CONFIG_FILE_WIP
mv $CONFIG_FILE_WIP $CONFIG_FILE
else
sed -i 's/__sentry-settings-for-celery-tasks__//g' $CONFIG_FILE
fi
CREDS_LISTER_PATH=/etc/credentials/metadata-fetcher/credentials
if [ -f $CREDS_LISTER_PATH ]; then
awk "/__metadata-fetcher-credentials__/{system(\"sed 's/^/ /g' $CREDS_LISTER_PATH\");next}1" $CONFIG_FILE > $CONFIG_FILE_WIP
mv $CONFIG_FILE_WIP $CONFIG_FILE
else
sed -i 's/__metadata-fetcher-credentials__//g' $CONFIG_FILE
fi
exit 0
logging-configuration.yml: |
version: 1
handlers:
console:
class: logging.StreamHandler
formatter: json
stream: ext://sys.stdout
formatters:
json:
class: pythonjsonlogger.jsonlogger.JsonFormatter
# python-json-logger parses the format argument to get the variables it actually expands into the json
format: "%(asctime)s:%(threadName)s:%(pathname)s:%(lineno)s:%(funcName)s:%(task_name)s:%(task_id)s:%(name)s:%(levelname)s:%(message)s"
loggers:
celery:
level: "INFO"
amqp:
level: WARNING
urllib3:
level: WARNING
azure.core.pipeline.policies.http_logging_policy:
level: WARNING
swh:
level: "INFO"
celery.task:
level: "INFO"
root:
level: "INFO"
handlers:
- console
# Source: swh/templates/listers/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: lister-save-bulk
namespace: swh-cassandra-next-version
labels:
app: lister-save-bulk
spec:
revisionHistoryLimit: 2
selector:
matchLabels:
app: lister-save-bulk
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
template:
metadata:
labels:
app: lister-save-bulk
annotations:
# Force a rollout upgrade if the configuration changes
checksum/config: c1b16646658edcbaf1d42fc623656c7b6f3520f3d2927a3e4a302eb3314676c9
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: swh/lister
operator: In
values:
- "true"
priorityClassName: swh-cassandra-next-version-normal-workload
terminationGracePeriodSeconds: 3600
initContainers:
- name: prepare-configuration
image: "debian:bullseye"
imagePullPolicy: IfNotPresent
env:
- name: AMQP_PASSWORD
valueFrom:
secretKeyRef:
key: password
name: rabbitmq-scheduler-secret
optional: false
- name: AMQP_USERNAME
valueFrom:
secretKeyRef:
key: username
name: rabbitmq-scheduler-secret
optional: false
command:
- /entrypoint.sh
volumeMounts:
- name: configuration-template
mountPath: /entrypoint.sh
subPath: init-container-entrypoint.sh
readOnly: true
- name: configuration
mountPath: /etc/swh
- name: configuration-template
mountPath: /etc/swh/configuration-template
- name: lister-credentials-secrets
mountPath: /etc/credentials/listers
readOnly: true
- name: sentry-settings-for-celery-tasks
mountPath: /etc/credentials/sentry-settings
readOnly: true
containers:
- name: listers
resources:
requests:
memory: 256Mi
cpu: 250m
image: "container-registry.softwareheritage.org/swh/infra/swh-apps/lister:20241014.2"
imagePullPolicy: IfNotPresent
command:
- /bin/bash
args:
- "-c"
- /opt/swh/entrypoint.sh
lifecycle:
preStop:
exec:
command:
- /pre-stop.sh
env:
- name: STATSD_HOST
value: prometheus-statsd-exporter
- name: STATSD_PORT
value: 9125
- name: STATSD_TAGS
value: "deployment:lister-save-bulk"
- name: MAX_TASKS_PER_CHILD
value: 1
- name: SWH_LOG_LEVEL
value: INFO
- name: SWH_CONFIG_FILENAME
value: /etc/swh/config.yml
- name: SWH_LOG_CONFIG
value: /etc/swh/logging-configuration.yml
- name: SWH_SENTRY_ENVIRONMENT
value: staging
- name: SWH_SENTRY_DISABLE_LOGGING_EVENTS
value: yes
volumeMounts:
- name: lister-utils
mountPath: /pre-stop.sh
subPath: pre-stop.sh
- name: configuration
mountPath: /etc/swh
- name: configuration-template
mountPath: /etc/swh/logging-configuration.yml
subPath: logging-configuration.yml
readOnly: true
volumes:
- name: configuration
ephemeral:
volumeClaimTemplate:
metadata:
labels:
type: ephemeral-volume
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 100Gi
storageClassName: local-path
- name: configuration-template
configMap:
name: lister-save-bulk-template
defaultMode: 0777
items:
- key: config.yml.template
path: config.yml.template
- key: init-container-entrypoint.sh
path: init-container-entrypoint.sh
- key: logging-configuration.yml
path: logging-configuration.yml
- name: lister-utils
configMap:
name: lister-utils
defaultMode: 0777
items:
- key: pre-stop-idempotent.sh
path: pre-stop.sh
- name: lister-credentials-secrets
secret:
secretName: lister-credentials-secrets
optional: true
- name: sentry-settings-for-celery-tasks
secret:
secretName: sentry-settings-for-celery-tasks
optional: true
# Source: swh/templates/loaders/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: loader-save-bulk
namespace: swh-cassandra-next-version
labels:
app: loader-save-bulk
spec:
revisionHistoryLimit: 2
selector:
matchLabels:
app: loader-save-bulk
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
template:
metadata:
labels:
app: loader-save-bulk
annotations:
# Force a rollout upgrade if the configuration changes
checksum/config: 52ca5c06adc7643d075a12667ffaa81fda770a53e4854e07283fcaf84d96d7fc
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: swh/loader
operator: In
values:
- "true"
priorityClassName: swh-cassandra-next-version-normal-workload
terminationGracePeriodSeconds: 60
dnsConfig:
options:
- name: ndots
value: 1
searches:
- cluster.local
- svc.cluster.local
- swh-cassandra-next-version.svc.cluster.local
initContainers:
- name: prepare-configuration
image: "debian:bullseye"
imagePullPolicy: IfNotPresent
env:
- name: AMQP_PASSWORD
valueFrom:
secretKeyRef:
key: password
name: rabbitmq-scheduler-secret
optional: false
- name: AMQP_USERNAME
valueFrom:
secretKeyRef:
key: username
name: rabbitmq-scheduler-secret
optional: false
command:
- /entrypoint.sh
volumeMounts:
- name: configuration-template
mountPath: /entrypoint.sh
subPath: init-container-entrypoint.sh
readOnly: true
- name: configuration
mountPath: /etc/swh
- name: configuration-template
mountPath: /etc/swh/configuration-template
- name: metadata-fetcher-credentials
mountPath: /etc/credentials/metadata-fetcher
readOnly: true
- name: sentry-settings-for-celery-tasks
mountPath: /etc/credentials/sentry-settings
readOnly: true
containers:
- name: loaders
image: "container-registry.softwareheritage.org/swh/infra/swh-apps/loader_savecodenow:20241014.1"
imagePullPolicy: IfNotPresent
command:
- /opt/swh/entrypoint.sh
resources:
requests:
memory: 200Mi
cpu: 50m
lifecycle:
preStop:
exec:
command:
- /pre-stop.sh
env:
- name: STATSD_HOST
value: prometheus-statsd-exporter
- name: STATSD_PORT
value: 9125
- name: STATSD_TAGS
value: "deployment:loader-save-bulk"
- name: MAX_TASKS_PER_CHILD
value: 10
- name: SWH_LOG_LEVEL
value: INFO
- name: SWH_CONFIG_FILENAME
value: /etc/swh/config.yml
- name: SWH_LOG_CONFIG
value: /etc/swh/logging-configuration.yml
- name: SWH_SENTRY_ENVIRONMENT
value: staging
- name: SWH_SENTRY_DISABLE_LOGGING_EVENTS
value: yes
volumeMounts:
- name: loader-utils
mountPath: /pre-stop.sh
subPath: pre-stop.sh
- name: configuration
mountPath: /etc/swh
- name: localstorage
mountPath: /tmp
- name: configuration-template
mountPath: /etc/swh/logging-configuration.yml
subPath: logging-configuration.yml
readOnly: true
volumes:
- name: localstorage
ephemeral:
volumeClaimTemplate:
metadata:
labels:
type: ephemeral-volume
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 100Gi
storageClassName: local-path
- name: configuration
emptyDir: {}
- name: configuration-template
configMap:
name: loader-save-bulk-template
defaultMode: 0777
items:
- key: config.yml.template
path: config.yml.template
- key: init-container-entrypoint.sh
path: init-container-entrypoint.sh
- key: logging-configuration.yml
path: logging-configuration.yml
- name: loader-utils
configMap:
name: loader-utils
defaultMode: 0777
items:
- key: pre-stop-idempotent.sh
path: pre-stop.sh
- name: metadata-fetcher-credentials
secret:
secretName: metadata-fetcher-credentials
optional: true
- name: sentry-settings-for-celery-tasks
secret:
secretName: sentry-settings-for-celery-tasks
optional: true
# Source: swh/templates/scheduler/extra-services-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
namespace: swh-cassandra-next-version
name: scheduler-runner-first-visits
labels:
app: scheduler-runner-first-visits
spec:
revisionHistoryLimit: 2
replicas: 1
selector:
matchLabels:
app: scheduler-runner-first-visits
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
template:
metadata:
labels:
app: scheduler-runner-first-visits
annotations:
checksum/config: 63de62e998530279a875c5b3d9fd3628ab2989d523a1a56468129cc8f3f31507
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: swh/scheduler
operator: In
values:
- "true"
priorityClassName: swh-cassandra-next-version-frontend-rpc-workload
initContainers:
- name: prepare-configuration
image: "debian:bullseye"
imagePullPolicy: IfNotPresent
command:
- /bin/bash
args:
- "-c"
- "eval echo "\"$(</etc/swh/configuration-template/config.yml.template)\"" > /etc/swh/config.yml"
env:
- name: AMQP_PASSWORD
valueFrom:
secretKeyRef:
key: password
name: rabbitmq-scheduler-secret
optional: false
- name: AMQP_USERNAME
valueFrom:
secretKeyRef:
key: username
name: rabbitmq-scheduler-secret
optional: false
volumeMounts:
- name: configuration
mountPath: /etc/swh
- name: configuration-template
mountPath: /etc/swh/configuration-template
containers:
- name: scheduler-runner-first-visits
resources:
requests:
memory: 100Mi
cpu: 10m
image: "container-registry.softwareheritage.org/swh/infra/swh-apps/scheduler:20241014.1"
command:
- /opt/swh/entrypoint.sh
args:
- swh
- scheduler
- "--config-file"
- /etc/swh/config.yml
- start-runner-first-visits
- "--period"
- 10
env:
- name: STATSD_HOST
value: prometheus-statsd-exporter
- name: STATSD_PORT
value: 9125
- name: STATSD_TAGS
value: "deployment:scheduler-runner-first-visits"
- name: SWH_CONFIG_FILENAME
value: /etc/swh/config.yml
- name: SWH_LOG_LEVEL
value: INFO
imagePullPolicy: IfNotPresent
volumeMounts:
- name: configuration
mountPath: /etc/swh
volumes:
- name: configuration
emptyDir: {}
- name: configuration-template
configMap:
name: extra-services-configuration-template
items:
- key: config.yml.template
path: config.yml.template
# Source: swh/templates/listers/keda-autoscaling.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: lister-save-bulk-operators
namespace: swh-cassandra-next-version
spec:
scaleTargetRef:
apiVersion: apps/v1 # Optional. Default: apps/v1
kind: Deployment # Optional. Default: Deployment
# Mandatory. Must be in same namespace as ScaledObject
name: lister-save-bulk
# envSourceContainerName: {container-name} # Optional. Default:
# .spec.template.spec.containers[0]
pollingInterval: 30 # Optional. Default: 30 seconds
cooldownPeriod: 3600
# ^ Optional. Default: 300 seconds
idleReplicaCount: 0 # Set to 0 to stop all the workers when
# there is no activity on the queue
minReplicaCount: 0
maxReplicaCount: 1
triggers:
- type: rabbitmq
authenticationRef:
name: amqp-authentication-lister-save-bulk
metadata:
protocol: auto # Optional. Specifies protocol to use,
# either amqp or http, or auto to
# autodetect based on the `host` value.
# Default value is auto.
mode: QueueLength # QueueLength to trigger on number of msgs in queue
excludeUnacknowledged: "false" # QueueLength should include unacked messages
# Implies "http" protocol is used
value: 1
queueName: swh.lister.save_bulk.tasks.SaveBulkListerTask
vhostName: / # Optional. If not specified, use the vhost in the
# `host` connection string. Alternatively, you can
# use existing environment variables to read
# configuration from: See details in "Parameter
# list" section hostFromEnv: RABBITMQ_HOST%
# Source: swh/templates/loaders/keda-autoscaling.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: loader-save-bulk-operators
namespace: swh-cassandra-next-version
spec:
scaleTargetRef:
apiVersion: apps/v1 # Optional. Default: apps/v1
kind: Deployment # Optional. Default: Deployment
# Mandatory. Must be in same namespace as ScaledObject
name: loader-save-bulk
# envSourceContainerName: {container-name} # Optional. Default:
# .spec.template.spec.containers[0]
pollingInterval: 30 # Optional. Default: 30 seconds
cooldownPeriod: 300
# ^ Optional. Default: 300 seconds
idleReplicaCount: 0 # Set to 0 to stop all the workers when
# there is no activity on the queue
minReplicaCount: 0
maxReplicaCount: 1
triggers:
- type: rabbitmq
authenticationRef:
name: amqp-authentication-loader-save-bulk
metadata:
protocol: auto # Optional. Specifies protocol to use,
# either amqp or http, or auto to
# autodetect based on the `host` value.
# Default value is auto.
mode: QueueLength # QueueLength to trigger on number of msgs in queue
excludeUnacknowledged: "false" # QueueLength should include unacked messages
# Implies "http" protocol is used
value: 10
queueName: "save-bulk:swh.loader.bzr.tasks.LoadBazaar"
vhostName: / # Optional. If not specified, use the vhost in the
# `host` connection string. Alternatively, you can
# use existing environment variables to read
# configuration from: See details in "Parameter
# list" section hostFromEnv: RABBITMQ_HOST%
- type: rabbitmq
authenticationRef:
name: amqp-authentication-loader-save-bulk
metadata:
protocol: auto # Optional. Specifies protocol to use,
# either amqp or http, or auto to
# autodetect based on the `host` value.
# Default value is auto.
mode: QueueLength # QueueLength to trigger on number of msgs in queue
excludeUnacknowledged: "false" # QueueLength should include unacked messages
# Implies "http" protocol is used
value: 10
queueName: "save-bulk:swh.loader.cvs.tasks.LoadCvsRepository"
vhostName: / # Optional. If not specified, use the vhost in the
# `host` connection string. Alternatively, you can
# use existing environment variables to read
# configuration from: See details in "Parameter
# list" section hostFromEnv: RABBITMQ_HOST%
- type: rabbitmq
authenticationRef:
name: amqp-authentication-loader-save-bulk
metadata:
protocol: auto # Optional. Specifies protocol to use,
# either amqp or http, or auto to
# autodetect based on the `host` value.
# Default value is auto.
mode: QueueLength # QueueLength to trigger on number of msgs in queue
excludeUnacknowledged: "false" # QueueLength should include unacked messages
# Implies "http" protocol is used
value: 10
queueName: "save-bulk:swh.loader.git.tasks.UpdateGitRepository"
vhostName: / # Optional. If not specified, use the vhost in the
# `host` connection string. Alternatively, you can
# use existing environment variables to read
# configuration from: See details in "Parameter
# list" section hostFromEnv: RABBITMQ_HOST%
- type: rabbitmq
authenticationRef:
name: amqp-authentication-loader-save-bulk
metadata:
protocol: auto # Optional. Specifies protocol to use,
# either amqp or http, or auto to
# autodetect based on the `host` value.
# Default value is auto.
mode: QueueLength # QueueLength to trigger on number of msgs in queue
excludeUnacknowledged: "false" # QueueLength should include unacked messages
# Implies "http" protocol is used
value: 10
queueName: "save-bulk:swh.loader.git.tasks.LoadDiskGitRepository"
vhostName: / # Optional. If not specified, use the vhost in the
# `host` connection string. Alternatively, you can
# use existing environment variables to read
# configuration from: See details in "Parameter
# list" section hostFromEnv: RABBITMQ_HOST%
- type: rabbitmq
authenticationRef:
name: amqp-authentication-loader-save-bulk
metadata:
protocol: auto # Optional. Specifies protocol to use,
# either amqp or http, or auto to
# autodetect based on the `host` value.
# Default value is auto.
mode: QueueLength # QueueLength to trigger on number of msgs in queue
excludeUnacknowledged: "false" # QueueLength should include unacked messages
# Implies "http" protocol is used
value: 10
queueName: "save-bulk:swh.loader.git.tasks.UncompressAndLoadDiskGitRepository"
vhostName: / # Optional. If not specified, use the vhost in the
# `host` connection string. Alternatively, you can
# use existing environment variables to read
# configuration from: See details in "Parameter
# list" section hostFromEnv: RABBITMQ_HOST%
- type: rabbitmq
authenticationRef:
name: amqp-authentication-loader-save-bulk
metadata:
protocol: auto # Optional. Specifies protocol to use,
# either amqp or http, or auto to
# autodetect based on the `host` value.
# Default value is auto.
mode: QueueLength # QueueLength to trigger on number of msgs in queue
excludeUnacknowledged: "false" # QueueLength should include unacked messages
# Implies "http" protocol is used
value: 10
queueName: "save-bulk:swh.loader.mercurial.tasks.LoadArchiveMercurial"
vhostName: / # Optional. If not specified, use the vhost in the
# `host` connection string. Alternatively, you can
# use existing environment variables to read
# configuration from: See details in "Parameter
# list" section hostFromEnv: RABBITMQ_HOST%
- type: rabbitmq
authenticationRef:
name: amqp-authentication-loader-save-bulk
metadata:
protocol: auto # Optional. Specifies protocol to use,
# either amqp or http, or auto to
# autodetect based on the `host` value.
# Default value is auto.
mode: QueueLength # QueueLength to trigger on number of msgs in queue
excludeUnacknowledged: "false" # QueueLength should include unacked messages
# Implies "http" protocol is used
value: 10
queueName: "save-bulk:swh.loader.mercurial.tasks.LoadMercurial"
vhostName: / # Optional. If not specified, use the vhost in the
# `host` connection string. Alternatively, you can
# use existing environment variables to read
# configuration from: See details in "Parameter
# list" section hostFromEnv: RABBITMQ_HOST%
- type: rabbitmq
authenticationRef:
name: amqp-authentication-loader-save-bulk
metadata:
protocol: auto # Optional. Specifies protocol to use,
# either amqp or http, or auto to
# autodetect based on the `host` value.
# Default value is auto.
mode: QueueLength # QueueLength to trigger on number of msgs in queue
excludeUnacknowledged: "false" # QueueLength should include unacked messages
# Implies "http" protocol is used
value: 10
queueName: "save-bulk:swh.loader.svn.tasks.LoadSvnRepository"
vhostName: / # Optional. If not specified, use the vhost in the
# `host` connection string. Alternatively, you can
# use existing environment variables to read
# configuration from: See details in "Parameter
# list" section hostFromEnv: RABBITMQ_HOST%
- type: rabbitmq
authenticationRef:
name: amqp-authentication-loader-save-bulk
metadata:
protocol: auto # Optional. Specifies protocol to use,
# either amqp or http, or auto to
# autodetect based on the `host` value.
# Default value is auto.
mode: QueueLength # QueueLength to trigger on number of msgs in queue
excludeUnacknowledged: "false" # QueueLength should include unacked messages
# Implies "http" protocol is used
value: 10
queueName: "save-bulk:swh.loader.svn.tasks.MountAndLoadSvnRepository"
vhostName: / # Optional. If not specified, use the vhost in the
# `host` connection string. Alternatively, you can
# use existing environment variables to read
# configuration from: See details in "Parameter
# list" section hostFromEnv: RABBITMQ_HOST%
- type: rabbitmq
authenticationRef:
name: amqp-authentication-loader-save-bulk
metadata:
protocol: auto # Optional. Specifies protocol to use,
# either amqp or http, or auto to
# autodetect based on the `host` value.
# Default value is auto.
mode: QueueLength # QueueLength to trigger on number of msgs in queue
excludeUnacknowledged: "false" # QueueLength should include unacked messages
# Implies "http" protocol is used
value: 10
queueName: "save-bulk:swh.loader.svn.tasks.DumpMountAndLoadSvnRepository"
vhostName: / # Optional. If not specified, use the vhost in the
# `host` connection string. Alternatively, you can
# use existing environment variables to read
# configuration from: See details in "Parameter
# list" section hostFromEnv: RABBITMQ_HOST%
- type: rabbitmq
authenticationRef:
name: amqp-authentication-loader-save-bulk
metadata:
protocol: auto # Optional. Specifies protocol to use,
# either amqp or http, or auto to
# autodetect based on the `host` value.
# Default value is auto.
mode: QueueLength # QueueLength to trigger on number of msgs in queue
excludeUnacknowledged: "false" # QueueLength should include unacked messages
# Implies "http" protocol is used
value: 10
queueName: "save-bulk:swh.loader.package.archive.tasks.LoadTarball"
vhostName: / # Optional. If not specified, use the vhost in the
# `host` connection string. Alternatively, you can
# use existing environment variables to read
# configuration from: See details in "Parameter
# list" section hostFromEnv: RABBITMQ_HOST%
# Source: swh/templates/listers/keda-autoscaling.yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
name: amqp-authentication-lister-save-bulk
namespace: swh-cassandra-next-version
spec:
secretTargetRef:
- parameter: host # "host" is required by the scalerObject trigger metadata
name: common-secrets
key: rabbitmq-http-host
# Source: swh/templates/loaders/keda-autoscaling.yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
name: amqp-authentication-loader-save-bulk
namespace: swh-cassandra-next-version
spec:
secretTargetRef:
- parameter: host # "host" is required by the scalerObject trigger metadata
name: common-secrets
key: rabbitmq-http-host
data.config.yml.template (v1/ConfigMap/swh-cassandra-next-version/web-cassandra-configuration-template)
± value change in multiline text (one insert, no deletions)
+ - swh.web.save_bulk
spec.template.metadata.annotations.checksum/config (apps/v1/Deployment/swh-cassandra-next-version/web-cassandra)
± value change
- 70e917c6847598408968b0584e57f21d420311c0a16dc8e0b6595126f1a96972
+ b72b07e9d007779f45f007dd81ca1668aac9578c64fbe59083d3089268435f83
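Note for reviewers on the `loader-save-bulk-operators` ScaledObject above: it declares one `rabbitmq` trigger per `save-bulk:` queue, each with `mode: QueueLength` and `value: 10`, and caps the deployment at `maxReplicaCount: 1`. The sketch below is only a rough approximation of how such a setup is expected to resolve to a replica count, assuming the usual HPA rule of `ceil(messages / target)` per trigger with the largest request winning. The function name `desired_replicas` and its parameters are hypothetical, and polling interval, cooldown and HPA stabilization are deliberately ignored.

```python
import math
from typing import Dict


def desired_replicas(
    queue_lengths: Dict[str, int],
    target: int,
    min_replicas: int = 0,
    max_replicas: int = 1,
) -> int:
    """Approximate the replica count for one ScaledObject with one trigger per queue.

    Assumption: each trigger requests ceil(messages / target) replicas and the
    largest request wins. Timing (pollingInterval, cooldownPeriod) is not modelled.
    """
    if all(count == 0 for count in queue_lengths.values()):
        # No queue has messages, so no trigger is active: fall back to the minimum
        # (0 in the rendered manifests, which stops all workers).
        return min_replicas
    # Largest per-queue request across all triggers.
    wanted = max(math.ceil(count / target) for count in queue_lengths.values())
    # Clamp to the configured bounds.
    return min(max(wanted, min_replicas), max_replicas)


queues = {
    "save-bulk:swh.loader.git.tasks.UpdateGitRepository": 25,
    "save-bulk:swh.loader.svn.tasks.LoadSvnRepository": 0,
}
print(desired_replicas(queues, target=10))
```

With `maxReplicaCount: 1`, any non-empty queue should therefore lead to a single loader pod in staging, however long the backlog is; raising the cap is what would let the `value: 10` target matter.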
------------- diff for environment production namespace swh -------------
between /tmp/swh-chart.swh.g4AQB5iO/production-swh.before, 152 documents
and /tmp/swh-chart.swh.g4AQB5iO/production-swh.after, 152 documents
returned no differences
------------- diff for environment production namespace swh-cassandra -------------
between /tmp/swh-chart.swh.g4AQB5iO/production-swh-cassandra.before, 473 documents
and /tmp/swh-chart.swh.g4AQB5iO/production-swh-cassandra.after, 473 documents
returned no differences
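Note for reviewers on the staging ConfigMaps: the `init-container-entrypoint.sh` scripts in `lister-save-bulk-template` and `loader-save-bulk-template` replace marker lines such as `__lister-credentials__` or `__sentry-settings-for-celery-tasks__` with the indented content of a mounted secret file, and simply blank the marker when the file is absent. The Python below is an illustrative re-statement of that awk/sed step, not the deployed code. The function name `fill_placeholder` is hypothetical, and the two-space indent is an assumption, since the exact width is not recoverable from the flattened diff.

```python
from typing import Optional


def fill_placeholder(
    config: str,
    placeholder: str,
    secret: Optional[str],
    indent: str = "  ",  # assumed width, see the note above
) -> str:
    """Mimic the placeholder step of init-container-entrypoint.sh in memory.

    `secret` is the content of the mounted credentials file, or None when the
    file does not exist (the secrets are declared `optional: true`).
    """
    if secret is None:
        # Same effect as `sed -i 's/<placeholder>//g'`: remove the marker text only.
        return config.replace(placeholder, "")
    result = []
    for line in config.splitlines():
        if placeholder in line:
            # Same effect as the awk rule: drop the marker line and emit the
            # secret's lines, each prefixed with the indent.
            result.extend(indent + secret_line for secret_line in secret.splitlines())
        else:
            result.append(line)
    return "\n".join(result) + "\n"


template = "credentials:\n  __lister-credentials__\ncelery:\n  task_acks_late: true\n"
print(fill_placeholder(template, "__lister-credentials__", "example:\n- username: u\n"))
```

Because the secret content is spliced in as text, its own indentation has to be valid YAML relative to the parent key once the prefix is added.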
Edited by Guillaume Samson