Refactor the checker stack
A checker configuration must now be created before a checker session can be started. This configuration is stored in the database and consists of a triplet (datastore, object_type, nb_partitions).

Once it exists, any number of checkers can be started for this specific checker configuration; each checker process checks partitions one by one, using the status stored in the database to get the next partition number to check on the next iteration. This makes it possible to adapt the number of checker processes dynamically.

For example, checking snapshots by splitting the hash space into 4096 partitions with 4 parallel workers could look like:

    $ export SWH_CONFIG_FILENAME=config.yml
    $ swh scrubber check init --object-type snapshot --nb-partitions 4096 --name cfg-snp
    Created configuration cfg-snp [3] for checking snapshot in postgresql storage
    $ for i in {1..4}; do (swh scrubber check storage cfg-snp &); done
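The coordination idea described above (a stored configuration plus per-partition status in the database, from which each worker picks its next partition) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the scrubber's actual schema or API: the table names `check_config` and `checked_partition` and the helpers `init_db`, `create_config` and `next_partition` are hypothetical, and an in-memory SQLite database stands in for the PostgreSQL scrubber database.

```python
import sqlite3
from typing import Optional


def init_db() -> sqlite3.Connection:
    """Create a throwaway database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE check_config (
            id INTEGER PRIMARY KEY,
            name TEXT UNIQUE NOT NULL,
            datastore TEXT NOT NULL,
            object_type TEXT NOT NULL,
            nb_partitions INTEGER NOT NULL
        );
        CREATE TABLE checked_partition (
            config_id INTEGER NOT NULL REFERENCES check_config(id),
            partition_id INTEGER NOT NULL,
            PRIMARY KEY (config_id, partition_id)
        );
        """
    )
    return conn


def create_config(
    conn: sqlite3.Connection,
    name: str,
    datastore: str,
    object_type: str,
    nb_partitions: int,
) -> int:
    """Store the (datastore, object_type, nb_partitions) triplet under a name."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO check_config (name, datastore, object_type, nb_partitions)"
            " VALUES (?, ?, ?, ?)",
            (name, datastore, object_type, nb_partitions),
        )
    return cur.lastrowid


def next_partition(conn: sqlite3.Connection, config_id: int) -> Optional[int]:
    """Claim the next unclaimed partition, or return None when all are taken.

    Partitions are claimed in order 0, 1, 2, ..., so the number of rows
    already recorded for this configuration is the next partition number.
    """
    (nb_partitions,) = conn.execute(
        "SELECT nb_partitions FROM check_config WHERE id = ?", (config_id,)
    ).fetchone()
    with conn:
        (claimed,) = conn.execute(
            "SELECT COUNT(*) FROM checked_partition WHERE config_id = ?",
            (config_id,),
        ).fetchone()
        if claimed >= nb_partitions:
            return None
        conn.execute(
            "INSERT INTO checked_partition (config_id, partition_id) VALUES (?, ?)",
            (config_id, claimed),
        )
    return claimed


if __name__ == "__main__":
    conn = init_db()
    cfg_id = create_config(conn, "cfg-snp", "postgresql", "snapshot", 8)
    # Four simulated workers take turns asking the database for work.
    worker = 0
    while (partition := next_partition(conn, cfg_id)) is not None:
        print(f"worker {worker} checks partition {partition}")
        worker = (worker + 1) % 4
```

Because the next partition number comes from shared state rather than from a fixed per-worker range, workers can be added or stopped at any time without reconfiguring the others. A real multi-process setup would also need the claim step to be safe under concurrent access, for example through row locking in PostgreSQL; the sketch above uses a single connection and does not demonstrate that.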
Changed files:
- swh/scrubber/cli.py: 118 additions, 34 deletions
- swh/scrubber/db.py: 204 additions, 53 deletions
- swh/scrubber/sql/30-schema.sql: 4 additions, 2 deletions
- swh/scrubber/sql/60-indexes.sql: 1 addition, 0 deletions
- swh/scrubber/sql/upgrades/6.sql: 5 additions, 3 deletions
- swh/scrubber/storage_checker.py: 91 additions, 90 deletions
- swh/scrubber/tests/conftest.py: 20 additions, 3 deletions
- swh/scrubber/tests/storage_checker_tests.py: 106 additions, 86 deletions
- swh/scrubber/tests/test_cli.py: 97 additions, 6 deletions
- swh/scrubber/tests/test_db.py: 129 additions, 51 deletions