production: Increase loader git's default configuration

This affects the add-forge-now, add-forge-now-slow and loader-large loaders, which all run the git loader internally.

Technically, this introduces a new "overrides" configuration entry, keyed by a loader's fully qualified class name. Only the loader matching that class name loads the overridden settings, even when a pod runs several other loaders.
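As an illustration only, here is a minimal Python sketch of how a per-class "overrides" entry could be resolved. It is not the actual swh.loader implementation: the helper names `fully_qualified_name` and `config_for`, the stand-in classes, and the merge semantics (override keys layered on top of the shared keys) are all assumptions made for the example.

```python
from typing import Any, Dict


def fully_qualified_name(cls: type) -> str:
    """Return 'module.ClassName' for a class (hypothetical helper)."""
    return f"{cls.__module__}.{cls.__name__}"


def config_for(cls: type, config: Dict[str, Any]) -> Dict[str, Any]:
    """Build the configuration seen by one loader class.

    Assumed semantics: shared keys apply to every loader, and keys found
    under overrides[<fully qualified class name>] are layered on top for
    the matching class only.
    """
    shared = {k: v for k, v in config.items() if k != "overrides"}
    specific = config.get("overrides", {}).get(fully_qualified_name(cls), {})
    return {**shared, **specific}


# Stand-in classes; __module__ is set by hand to mimic the real class paths.
class GitLoader:
    __module__ = "swh.loader.git.loader"


class OtherLoader:
    __module__ = "example.other.loader"


config = {
    "storage": {"cls": "pipeline"},
    "overrides": {
        "swh.loader.git.loader.GitLoader": {
            "pack_size_bytes": 34359738368,
            "temp_file_cutoff": 10737418240,
        },
    },
}

git_config = config_for(GitLoader, config)
other_config = config_for(OtherLoader, config)
```

In this sketch only `git_config` carries `pack_size_bytes` and `temp_file_cutoff`; `other_config` keeps the shared `storage` entry and nothing else.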

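Here, it configures the git loader (with the loader-git 2.4.0 release) with two extra parameters: the pack file size limit (pack_size_bytes, to accept bigger packs) and the memory threshold (temp_file_cutoff) above which it starts using disk instead. For loader-large, the existing top-level temp_file_cutoff moves under "overrides" with the same value.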
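For readers checking the raw byte values in the diff below, this short Python sketch converts them to binary units. The values are copied from the diff; the helper name `to_gib` is made up for the example.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def to_gib(size_in_bytes: int) -> float:
    """Convert a byte count to GiB (hypothetical helper)."""
    return size_in_bytes / GIB


pack_size_bytes = 34359738368   # value from the diff
temp_file_cutoff = 10737418240  # value from the diff

print(f"pack_size_bytes  = {to_gib(pack_size_bytes):g} GiB")   # 32 GiB
print(f"temp_file_cutoff = {to_gib(temp_file_cutoff):g} GiB")  # 10 GiB
```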

[1] Diff:

$ /var/tmp/diff-chart.sh adapt-loader-configuration-with-overrides
Switched to branch 'production'
Your branch is up to date with 'origin/production'.
generate config in production branch for values/default.yaml
generate config in production branch for values/production-cassandra.yaml
generate config in production branch for values/production.yaml
generate config in production branch for values/staging-cassandra.yaml
generate config in production branch for values/staging.yaml
Switched to branch 'adapt-loader-configuration-with-overrides'
generate config in adapt-loader-configuration-with-overrides branch for values/default.yaml
generate config in adapt-loader-configuration-with-overrides branch for values/production-cassandra.yaml
generate config in adapt-loader-configuration-with-overrides branch for values/production.yaml
generate config in adapt-loader-configuration-with-overrides branch for values/staging-cassandra.yaml
generate config in adapt-loader-configuration-with-overrides branch for values/staging.yaml


------------- diff for values/default.yaml -------------

No diff


------------- diff for values/production-cassandra.yaml -------------

No diff


------------- diff for values/production.yaml -------------

--- /tmp/production.yaml.before 2023-06-07 13:50:40.472606503 +0200
+++ /tmp/production.yaml.after  2023-06-07 13:50:40.784606735 +0200
@@ -304,40 +304,44 @@
   config.yml.template: |
     storage:
       cls: pipeline
       steps:
       - cls: buffer
         min_batch_size:
           content: 1000
           content_bytes: 52428800
           directory: 1000
           directory_entries: 12000
           extid: 1000
           release: 1000
           release_bytes: 52428800
           revision: 1000
           revision_bytes: 52428800
           revision_parents: 2000
       - cls: filter
       - cls: retry
       - cls: remote
         url: http://saam.internal.softwareheritage.org:5002
+    overrides:
+      swh.loader.git.loader.GitLoader:
+        pack_size_bytes: 34359738368
+        temp_file_cutoff: 10737418240

     celery:
       task_broker: amqp://${AMQP_USERNAME}:${AMQP_PASSWORD}@rabbitmq.internal.softwareheritage.org:/
       task_acks_late: true
       task_queues:
       - add_forge_now:swh.loader.git.tasks.UpdateGitRepository
       sentry_settings_for_celery_tasks:
         __sentry-settings-for-celery-tasks__
     metadata_fetcher_credentials:
       __metadata-fetcher-credentials__
   init-container-entrypoint.sh: |
     #!/bin/bash

     set -e

     CONFIG_FILE=/etc/swh/config.yml
     CONFIG_FILE_WIP=/tmp/wip-config.yml

     # substitute environment variables when creating the default config.yml
     eval echo \""$(</etc/swh/configuration-template/config.yml.template)"\" \
@@ -406,40 +410,44 @@
   config.yml.template: |
     storage:
       cls: pipeline
       steps:
       - cls: buffer
         min_batch_size:
           content: 1000
           content_bytes: 52428800
           directory: 1000
           directory_entries: 12000
           extid: 1000
           release: 1000
           release_bytes: 52428800
           revision: 1000
           revision_bytes: 52428800
           revision_parents: 2000
       - cls: filter
       - cls: retry
       - cls: remote
         url: http://saam.internal.softwareheritage.org:5002
+    overrides:
+      swh.loader.git.loader.GitLoader:
+        pack_size_bytes: 34359738368
+        temp_file_cutoff: 10737418240

     celery:
       task_broker: amqp://${AMQP_USERNAME}:${AMQP_PASSWORD}@rabbitmq.internal.softwareheritage.org:/
       task_acks_late: true
       task_queues:
       - add_forge_now_slow:swh.loader.git.tasks.UpdateGitRepository
       sentry_settings_for_celery_tasks:
         __sentry-settings-for-celery-tasks__
     metadata_fetcher_credentials:
       __metadata-fetcher-credentials__
   init-container-entrypoint.sh: |
     #!/bin/bash

     set -e

     CONFIG_FILE=/etc/swh/config.yml
     CONFIG_FILE_WIP=/tmp/wip-config.yml

     # substitute environment variables when creating the default config.yml
     eval echo \""$(</etc/swh/configuration-template/config.yml.template)"\" \
@@ -610,42 +618,44 @@
   config.yml.template: |
     storage:
       cls: pipeline
       steps:
       - cls: buffer
         min_batch_size:
           content: 1000
           content_bytes: 52428800
           directory: 1000
           directory_entries: 12000
           extid: 1000
           release: 1000
           release_bytes: 52428800
           revision: 1000
           revision_bytes: 52428800
           revision_parents: 2000
       - cls: filter
       - cls: retry
       - cls: remote
         url: http://saam.internal.softwareheritage.org:5002
-    temp_file_cutoff:
-      10737418240
+    overrides:
+      swh.loader.git.loader.GitLoader:
+        pack_size_bytes: 34359738368
+        temp_file_cutoff: 10737418240

     celery:
       task_broker: amqp://${AMQP_USERNAME}:${AMQP_PASSWORD}@rabbitmq.internal.softwareheritage.org:/
       task_acks_late: false
       task_queues:
       - large_repository:swh.loader.git.tasks.UpdateGitRepository
       sentry_settings_for_celery_tasks:
         __sentry-settings-for-celery-tasks__
     metadata_fetcher_credentials:
       __metadata-fetcher-credentials__
   init-container-entrypoint.sh: |
     #!/bin/bash

     set -e

     CONFIG_FILE=/etc/swh/config.yml
     CONFIG_FILE_WIP=/tmp/wip-config.yml

     # substitute environment variables when creating the default config.yml
     eval echo \""$(</etc/swh/configuration-template/config.yml.template)"\" \

...snip... (remaining changes are only checksum hash updates)...

------------- diff for values/staging-cassandra.yaml -------------

No diff


------------- diff for values/staging.yaml -------------

No diff

Note: /var/tmp/diff-charts.sh from $1576

@gsamson might be interested ;)

Refs. !62 (closed)

Refs. swh/infra/sysadm-environment#4926 (closed)

Edited by Antoine R. Dumont