Deploy swh-scrubber v0.1.1
Add checkpointing to storage_checker so that a restarted worker does not recheck the objects at the beginning of its range every time.
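As a rough illustration of the idea (not the actual swh-scrubber code), here is a minimal sketch of range checkpointing. The names `check_ranges`, `checked` and `check_one` are hypothetical, and a plain dict stands in for whatever the checker really persists. The point is only that a range is recorded as done after it completes, so a restart skips finished ranges and redoes the interrupted one.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical type: a range is (start, end); start is None for the first
# range, mirroring the "range None to 000001" lines in the debug logs.
Range = Tuple[Optional[str], str]


def check_ranges(
    ranges: List[Range],
    checked: Dict[Range, datetime],
    check_one: Callable[[Range], None],
) -> List[Range]:
    """Check each range unless a checkpoint already exists for it.

    `checked` is an in-memory stand-in (an assumption for illustration)
    for the persistent checkpoint store. Returns the ranges that were
    actually processed during this call.
    """
    processed: List[Range] = []
    for rng in ranges:
        if rng in checked:
            # Already completed in a previous run: skip it.
            continue
        check_one(rng)
        # Record the checkpoint only once the range has completed, so a
        # range interrupted halfway is redone after a restart.
        checked[rng] = datetime.now(timezone.utc)
        processed.append(rng)
    return processed
```

Calling `check_ranges` a second time with the same `checked` mapping only processes the ranges that did not complete the first time.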
Staging:
- apply swh/scrubber/sql/upgrades/4.sql [1]
- upgrade package on workers and stop all workers
- start one worker with --log-level swh.scrubber.storage_checker:DEBUG [2]
- wait for a couple of "Processing %s range %s to %s" lines [2]
- restart it (still with debug logs) [3]
- check it is not processing the same ranges [3]
- restart all workers (without debug logs)
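The "check it is not processing the same ranges" step can also be done mechanically by comparing the debug output of the two runs. The sketch below is a hypothetical helper, not part of swh-scrubber. It assumes the log lines have exactly the format shown in [2] and [3], and that the first run was stopped while working on its last logged range, so that range is allowed to be processed again.

```python
import re
from typing import List, Set, Tuple

# Matches the "Processing <type> range <start> to <end>" DEBUG message.
# The "Skipping processing of ..." message is deliberately not matched.
PROCESSING = re.compile(r"Processing \w+ range (\S+) to (\S+)$")


def processed_ranges(log: str) -> List[Tuple[str, str]]:
    """Return (start, end) for every range a run started processing, in order."""
    found = []
    for line in log.splitlines():
        match = PROCESSING.search(line.strip())
        if match:
            found.append((match.group(1), match.group(2)))
    return found


def reprocessed(first_log: str, second_log: str) -> Set[Tuple[str, str]]:
    """Return ranges completed by the first run that the second run redid.

    Assumption: the first run was interrupted during its last logged
    range, so that range is excluded from the "completed" set.
    """
    completed = set(processed_ranges(first_log)[:-1])
    return completed & set(processed_ranges(second_log))
```

An empty result from `reprocessed(first_log, second_log)` means the restarted worker did not redo any range that the first run had finished.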
Production:
- apply swh/scrubber/sql/upgrades/4.sql [4]
- upgrade package on workers
- restart all workers
[1]
swhworker@scrubber0:~$ swh db --config-file /etc/softwareheritage/scrubber/primary.yml upgrade scrubber --module-config-key=scrubber_db
INFO:swh.core.db.db_utils:Executing migration script '/usr/lib/python3/dist-packages/swh/scrubber/sql/upgrades/4.sql'
Migration to version 4 done
[2]
swhworker@scrubber0:~$ export SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/primary.yml
swhworker@scrubber0:~$ swh --log-level swh.scrubber.storage_checker:DEBUG scrubber check storage --object-type directory --start-object 0000000000000000000000000000000000000000 --end-object 3fffffffffffffffffffffffffffffffffffffff
DEBUG:swh.scrubber.storage_checker:Processing directory range None to 000001
DEBUG:swh.scrubber.storage_checker:Processing directory range 000001 to 000002
DEBUG:swh.scrubber.storage_checker:Processing directory range 000002 to 000003
DEBUG:swh.scrubber.storage_checker:Processing directory range 000003 to 000004
DEBUG:swh.scrubber.storage_checker:Processing directory range 000004 to 000005
[3]
swhworker@scrubber0:~$ swh --log-level swh.scrubber.storage_checker:DEBUG scrubber check storage --object-type directory --start-object 0000000000000000000000000000000000000000 --end-object 3fffffffffffffffffffffffffffffffffffffff
DEBUG:swh.scrubber.storage_checker:Skipping processing of directory range None to 000001: already done at 2022-10-18 08:32:42.926663+00:00
DEBUG:swh.scrubber.storage_checker:Skipping processing of directory range 000001 to 000002: already done at 2022-10-18 08:32:49.098090+00:00
DEBUG:swh.scrubber.storage_checker:Skipping processing of directory range 000002 to 000003: already done at 2022-10-18 08:32:57.651759+00:00
DEBUG:swh.scrubber.storage_checker:Skipping processing of directory range 000003 to 000004: already done at 2022-10-18 08:33:11.836088+00:00
DEBUG:swh.scrubber.storage_checker:Processing directory range 000004 to 000005
[4]
swhworker@scrubber1:~$ swh db --config-file /etc/softwareheritage/scrubber/primary.yml upgrade scrubber --module-config-key=scrubber_db
INFO:swh.core.db.db_utils:Executing migration script '/usr/lib/python3/dist-packages/swh/scrubber/sql/upgrades/4.sql'
Migration to version 4 done
Edited by Antoine R. Dumont