Open
Milestone
Regularly scrub journal, storage, and objstorage [Roadmap - Preserve]
- Lead: vlorentz
- Priority: medium
- Effort: ??
Description:
Set up background jobs that regularly check data validity, and repair it when necessary, in all SWH data stores. This covers both blobs (swh-objstorage) and other graph objects (swh-storage), on all copies (in-house, Kafka, Azure, upcoming mirrors, etc.).
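As an illustration of the validity check described above, here is a minimal sketch for blobs. It assumes each blob is keyed by its git-style SHA-1 (the hash used in `swh:1:cnt:` identifiers) and recomputes that hash from the stored bytes. The names `git_blob_sha1` and `scrub_blobs` and the in-memory dict are hypothetical, chosen for illustration; this is not the swh-scrubber or swh-objstorage API.

```python
import hashlib
from typing import Dict, List


def git_blob_sha1(data: bytes) -> str:
    """Compute the git blob SHA-1: sha1(b"blob <length>\\0" + data)."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def scrub_blobs(blobs: Dict[str, bytes]) -> List[str]:
    """Return the identifiers whose stored bytes no longer hash to their key.

    `blobs` is a hypothetical stand-in for an object store: it maps the
    expected hex identifier to the bytes currently stored under it.
    """
    corrupt = []
    for expected_id, data in blobs.items():
        if git_blob_sha1(data) != expected_id:
            corrupt.append(expected_id)
    return corrupt


# Usage: one intact blob, one whose bytes were replaced by null bytes.
good = b"hello world\n"
bad_original = b"some content\n"
store = {
    git_blob_sha1(good): good,
    git_blob_sha1(bad_original): b"\0" * len(bad_original),
}
damaged = scrub_blobs(store)
```

A real scrubber would additionally need to iterate over the store in partitions, record what it has already checked, and fetch a known-good copy from another store to repair the damaged entry; none of that is shown here.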
Includes work:
- Implement storage scrubber for Cassandra
- Add scrubbing for the object storage
- Add metrics and Grafana dashboard for scrubbing process
- Automatically repair and recover objects found to be invalid
KPIs:
- List of scrubbers deployed in production
- Monitoring tools deployed in production
- Rolling report of operations per datastore including errors found and fixed at each iteration
Unstarted Issues (open and unassigned)
12
- Meta · Missing content on S3
- swh-scrubber · Max partition count not respected
- swh-scrubber · Fail to initialize a scrubber check configuration
- Meta · Some contents on S3 are full of null bytes
- swh-environment · [docker] Add the deployment of swh-scrubber
- swh-storage · cassandra: *_get_partition pagination does not handle Murmur3 collisions
- swh-scrubber · Check integrity of directories, revisions, and releases
- Meta · Deploy repair tools
- Meta · Automatically repair and recover objects found to be invalid
- Meta · Add scrubbing for the object storage
- Meta · Add metrics and Grafana dashboard for scrubbing process
- Meta · regularly scrub all the data stores of swh
Ongoing Issues (open and assigned)
2
Completed Issues (closed)
16
- sysadm-environment · Deploy the storage scrubber in elastic production
- sysadm-environment · Deploy the storage scrubbers in elastic staging
- swh-scrubber · ValueError: swh:1:dir:18025f166aa970fa5dc4e4e1adf0adcdc5fa1ecf has duplicated entry name: b'_posts'
- sysadm-environment · Deploy the journal scrubber in production
- sysadm-environment · Deploy the journal scrubbers in staging
- swh-scrubber · Improve the config-id parameter support
- sysadm-environment · Deploy scrubber v2.0
- sysadm-environment · [scrubber] Deploy the new version of the storage to fix the duplicate entry error in the storage
- sysadm-environment · Deploy swh-scrubber 1.0.1
- swh-storage · Generalize content_get_partition to other object types
- swh-scrubber · Adapt DB schema to support cassandra token (also, future-proofing to support sha1/sha256 when to scrub to objstorage?)
- swh-scrubber · Implement cassandra scrubber in storage_checker.py
- swh-export · Document using luigi to generate datasets
- swh-scrubber · Implement storage scrubber for cassandra
- swh-scrubber · Publish scrubber metrics and create grafana dashboard
- Live Database Audit · Check integrity of directories, revisions, and releases