scrubber: Adapt configuration for scrubber v2.0

Antoine R. Dumont requested to merge staging-update-scrubber into production

It may be easier to review commit by commit.

This reworks the current manifest to:

  • call the bootstrap configuration script once per object type to create and name the checker configuration in the scrubber db (the script is only called if the configuration is not already installed)
  • adapt the systemd service to pass only the name of the newly created scrubber configuration (each checker instance uses that configuration to determine which partition to scrub)
  • rename the configuration key from scrubber_db to scrubber

The above should be applied by puppet only after the scrubber package has been upgraded and the db migrated (so I'll first have to deactivate puppet on scrubber0.staging or scrubber1.production).
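
As a rough illustration of the "only called if the configuration is not already installed" guard, here is a minimal Python sketch. It assumes that `swh scrubber check list` prints one line per configuration in the form `<name>: <object type>, <nb partitions>`, which is what the grep patterns in the diff below suggest; the function names are hypothetical and are not part of swh-scrubber.

```python
def config_is_registered(check_list_output: str, name: str,
                         object_type: str, nb_partitions: int) -> bool:
    """Mimic `check list | grep '<name>: <object type>, <nb partitions>'`.

    Assumption: the listing has one configuration per line in that format.
    Like grep, this is a plain substring match on each line.
    """
    needle = f"{name}: {object_type}, {nb_partitions}"
    return any(needle in line for line in check_list_output.splitlines())


def should_run_init(check_list_output: str, name: str,
                    object_type: str, nb_partitions: int) -> bool:
    """Run `check init` only when the configuration is not listed yet."""
    return not config_is_registered(
        check_list_output, name, object_type, nb_partitions
    )


# Hypothetical listing in which only the directory configuration exists.
listing = "check-config-directory: directory, 16384\n"
print(should_run_init(listing, "check-config-directory", "directory", 16384))
print(should_run_init(listing, "check-config-release", "release", 4096))
```

In Puppet terms, running a command only when a check fails corresponds to the exec `unless` attribute, whereas `onlyif` runs the command when the check exits 0; the choice of attribute therefore determines whether the guard behaves as described.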

octocatalog-diff looks happy enough [1]

Refs. swh/infra/sysadm-environment#4992 (closed)

[1] (staging only; the production diff is similar but longer)

$ $SWH_PUPPET_ENVIRONMENT_HOME/bin/octocatalog-diff --to staging-update-scrubber scrubber0
Found host scrubber0.internal.staging.swh.network
Cloning into '/tmp/swh-ocd.GP7wlUC9/swh-site'...
done.
branch 'staging-update-scrubber' set up to track 'origin/staging-update-scrubber'.
Switched to a new branch 'staging-update-scrubber'
WARN     -> Environment "staging-update-scrubber" contained non-word characters, correcting name to staging_update_scrubber
Cloning into '/tmp/swh-ocd.GP7wlUC9/environments/production/data/private'...
done.
Cloning into '/tmp/swh-ocd.GP7wlUC9/environments/staging_update_scrubber/data/private'...
done.
*** Running octocatalog-diff on host scrubber0.internal.staging.swh.network
I, [2023-07-25T17:10:20.195987 #3600074]  INFO -- : Catalogs compiled for scrubber0.internal.staging.swh.network
I, [2023-07-25T17:10:20.502552 #3600074]  INFO -- : Diffs computed for scrubber0.internal.staging.swh.network
diff origin/production/scrubber0.internal.staging.swh.network current/scrubber0.internal.staging.swh.network
*******************************************
+ Exec[swh-scrubber-config-init-directory] =>
   parameters =>
     "command": [
       "swh scrubber --config-file /etc/softwareheritage/scrubber/storage_primary...
       "check init",
       "--object-type directory",
       "--nb-partitions 16384",
       "--name check-config-directory"
     ],
     "onlyif": [
       "swh scrubber --config-file /etc/softwareheritage/scrubber/storage_primary...
       "check list | grep 'check-config-directory: directory, 16384'"
     ],
     "path": "/usr/local/bin:/usr/bin/:/bin/"
*******************************************
+ Exec[swh-scrubber-config-init-release] =>
   parameters =>
     "command": [
       "swh scrubber --config-file /etc/softwareheritage/scrubber/storage_primary...
       "check init",
       "--object-type release",
       "--nb-partitions 4096",
       "--name check-config-release"
     ],
     "onlyif": [
       "swh scrubber --config-file /etc/softwareheritage/scrubber/storage_primary...
       "check list | grep 'check-config-release: release, 4096'"
     ],
     "path": "/usr/local/bin:/usr/bin/:/bin/"
*******************************************
+ Exec[swh-scrubber-config-init-revision] =>
   parameters =>
     "command": [
       "swh scrubber --config-file /etc/softwareheritage/scrubber/storage_primary...
       "check init",
       "--object-type revision",
       "--nb-partitions 16384",
       "--name check-config-revision"
     ],
     "onlyif": [
       "swh scrubber --config-file /etc/softwareheritage/scrubber/storage_primary...
       "check list | grep 'check-config-revision: revision, 16384'"
     ],
     "path": "/usr/local/bin:/usr/bin/:/bin/"
*******************************************
+ Exec[swh-scrubber-config-init-snapshot] =>
   parameters =>
     "command": [
       "swh scrubber --config-file /etc/softwareheritage/scrubber/storage_primary...
       "check init",
       "--object-type snapshot",
       "--nb-partitions 16384",
       "--name check-config-snapshot"
     ],
     "onlyif": [
       "swh scrubber --config-file /etc/softwareheritage/scrubber/storage_primary...
       "check list | grep 'check-config-snapshot: snapshot, 16384'"
     ],
     "path": "/usr/local/bin:/usr/bin/:/bin/"
*******************************************
  File[/etc/softwareheritage/scrubber/storage_primary.yml] =>
   parameters =>
     content =>
      @@ -1,4 +1,4 @@
       # File managed by puppet - modifications will be lost
      -scrubber_db:
      +scrubber:
         cls: postgresql
         db: host=db1.internal.staging.swh.network port=5432 dbname=swh-scrubber user=swh-scrubber
*******************************************
  File[/etc/systemd/system/swh-scrubber-checker-postgres@primary-directory-0.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type directory --nb-partitions 1048576 --start-partition-id 0 --end-partition-id 262144"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-directory"
*******************************************
  File[/etc/systemd/system/swh-scrubber-checker-postgres@primary-directory-1.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type directory --nb-partitions 1048576 --start-partition-id 262144 --end-partition-id 524288"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-directory"
*******************************************
  File[/etc/systemd/system/swh-scrubber-checker-postgres@primary-directory-2.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type directory --nb-partitions 1048576 --start-partition-id 524288 --end-partition-id 786432"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-directory"
*******************************************
  File[/etc/systemd/system/swh-scrubber-checker-postgres@primary-directory-3.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type directory --nb-partitions 1048576 --start-partition-id 786432 --end-partition-id 1048576"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-directory"
*******************************************
  File[/etc/systemd/system/swh-scrubber-checker-postgres@primary-release-0.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type release --nb-partitions 1024 --start-partition-id 0 --end-partition-id 256"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-release"
*******************************************
  File[/etc/systemd/system/swh-scrubber-checker-postgres@primary-release-1.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type release --nb-partitions 1024 --start-partition-id 256 --end-partition-id 512"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-release"
*******************************************
  File[/etc/systemd/system/swh-scrubber-checker-postgres@primary-release-2.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type release --nb-partitions 1024 --start-partition-id 512 --end-partition-id 768"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-release"
*******************************************
  File[/etc/systemd/system/swh-scrubber-checker-postgres@primary-release-3.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type release --nb-partitions 1024 --start-partition-id 768 --end-partition-id 1024"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-release"
*******************************************
  File[/etc/systemd/system/swh-scrubber-checker-postgres@primary-revision-0.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type revision --nb-partitions 1048576 --start-partition-id 0 --end-partition-id 262144"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-revision"
*******************************************
  File[/etc/systemd/system/swh-scrubber-checker-postgres@primary-revision-1.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type revision --nb-partitions 1048576 --start-partition-id 262144 --end-partition-id 524288"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-revision"
*******************************************
  File[/etc/systemd/system/swh-scrubber-checker-postgres@primary-revision-2.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type revision --nb-partitions 1048576 --start-partition-id 524288 --end-partition-id 786432"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-revision"
*******************************************
  File[/etc/systemd/system/swh-scrubber-checker-postgres@primary-revision-3.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type revision --nb-partitions 1048576 --start-partition-id 786432 --end-partition-id 1048576"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-revision"
*******************************************
  File[/etc/systemd/system/swh-scrubber-checker-postgres@primary-snapshot-0.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type snapshot --nb-partitions 1048576 --start-partition-id 0 --end-partition-id 262144"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-snapshot"
*******************************************
  File[/etc/systemd/system/swh-scrubber-checker-postgres@primary-snapshot-1.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type snapshot --nb-partitions 1048576 --start-partition-id 262144 --end-partition-id 524288"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-snapshot"
*******************************************
  File[/etc/systemd/system/swh-scrubber-checker-postgres@primary-snapshot-2.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type snapshot --nb-partitions 1048576 --start-partition-id 524288 --end-partition-id 786432"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-snapshot"
*******************************************
  File[/etc/systemd/system/swh-scrubber-checker-postgres@primary-snapshot-3.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type snapshot --nb-partitions 1048576 --start-partition-id 786432 --end-partition-id 1048576"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-snapshot"
*******************************************
  Systemd::Dropin_file[swh-scrubber-checker-postgres@primary-directory-0.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type directory --nb-partitions 1048576 --start-partition-id 0 --end-partition-id 262144"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-directory"
*******************************************
  Systemd::Dropin_file[swh-scrubber-checker-postgres@primary-directory-1.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type directory --nb-partitions 1048576 --start-partition-id 262144 --end-partition-id 524288"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-directory"
*******************************************
  Systemd::Dropin_file[swh-scrubber-checker-postgres@primary-directory-2.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type directory --nb-partitions 1048576 --start-partition-id 524288 --end-partition-id 786432"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-directory"
*******************************************
  Systemd::Dropin_file[swh-scrubber-checker-postgres@primary-directory-3.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type directory --nb-partitions 1048576 --start-partition-id 786432 --end-partition-id 1048576"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-directory"
*******************************************
  Systemd::Dropin_file[swh-scrubber-checker-postgres@primary-release-0.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type release --nb-partitions 1024 --start-partition-id 0 --end-partition-id 256"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-release"
*******************************************
  Systemd::Dropin_file[swh-scrubber-checker-postgres@primary-release-1.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type release --nb-partitions 1024 --start-partition-id 256 --end-partition-id 512"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-release"
*******************************************
  Systemd::Dropin_file[swh-scrubber-checker-postgres@primary-release-2.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type release --nb-partitions 1024 --start-partition-id 512 --end-partition-id 768"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-release"
*******************************************
  Systemd::Dropin_file[swh-scrubber-checker-postgres@primary-release-3.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type release --nb-partitions 1024 --start-partition-id 768 --end-partition-id 1024"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-release"
*******************************************
  Systemd::Dropin_file[swh-scrubber-checker-postgres@primary-revision-0.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type revision --nb-partitions 1048576 --start-partition-id 0 --end-partition-id 262144"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-revision"
*******************************************
  Systemd::Dropin_file[swh-scrubber-checker-postgres@primary-revision-1.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type revision --nb-partitions 1048576 --start-partition-id 262144 --end-partition-id 524288"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-revision"
*******************************************
  Systemd::Dropin_file[swh-scrubber-checker-postgres@primary-revision-2.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type revision --nb-partitions 1048576 --start-partition-id 524288 --end-partition-id 786432"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-revision"
*******************************************
  Systemd::Dropin_file[swh-scrubber-checker-postgres@primary-revision-3.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type revision --nb-partitions 1048576 --start-partition-id 786432 --end-partition-id 1048576"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-revision"
*******************************************
  Systemd::Dropin_file[swh-scrubber-checker-postgres@primary-snapshot-0.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type snapshot --nb-partitions 1048576 --start-partition-id 0 --end-partition-id 262144"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-snapshot"
*******************************************
  Systemd::Dropin_file[swh-scrubber-checker-postgres@primary-snapshot-1.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type snapshot --nb-partitions 1048576 --start-partition-id 262144 --end-partition-id 524288"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-snapshot"
*******************************************
  Systemd::Dropin_file[swh-scrubber-checker-postgres@primary-snapshot-2.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type snapshot --nb-partitions 1048576 --start-partition-id 524288 --end-partition-id 786432"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-snapshot"
*******************************************
  Systemd::Dropin_file[swh-scrubber-checker-postgres@primary-snapshot-3.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -4,3 +4,3 @@
       [Service]
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/storage_primary.yml
      -Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="--object-type snapshot --nb-partitions 1048576 --start-partition-id 786432 --end-partition-id 1048576"
      +Environment=SWH_SCRUBBER_CLI_EXTRA_ARGS="check-config-snapshot"
*******************************************
*** End octocatalog-diff on scrubber0.internal.staging.swh.network
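
For reference, the removed `SWH_SCRUBBER_CLI_EXTRA_ARGS` values in the diff above split the partition space evenly across four checker instances per object type. The sketch below reproduces those ranges; it only illustrates the old static split and says nothing about how scrubber v2.0 assigns partitions internally. The function names are hypothetical.

```python
from typing import List, Tuple


def static_ranges(nb_partitions: int, nb_instances: int) -> List[Tuple[int, int]]:
    """Evenly split [0, nb_partitions) into nb_instances half-open ranges.

    Assumes nb_partitions is divisible by nb_instances, as in the diff.
    """
    step = nb_partitions // nb_instances
    return [(i * step, (i + 1) * step) for i in range(nb_instances)]


def legacy_extra_args(object_type: str, nb_partitions: int,
                      start: int, end: int) -> str:
    """Build the old-style argument string that the drop-in files carried."""
    return (
        f"--object-type {object_type} --nb-partitions {nb_partitions} "
        f"--start-partition-id {start} --end-partition-id {end}"
    )


# Old layout for the directory checkers: 1048576 partitions over 4 instances.
for start, end in static_ranges(1048576, 4):
    print(legacy_extra_args("directory", 1048576, start, end))
```

With the new layout, each drop-in carries only the configuration name (for example `check-config-directory`), so the per-instance ranges no longer need to be computed in the manifest.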
Edited by Antoine R. Dumont
