- Aug 30, 2024
-
-
Antoine Lambert authored
-
Antoine Lambert authored
-
- Aug 27, 2024
-
-
David Douard authored
-
- May 15, 2024
-
-
Pierre-Yves David authored
-
Pierre-Yves David authored
-
- Apr 11, 2024
-
-
Antoine Lambert authored
Promote use of `swh scrubber check run` instead of deprecated commands. Add sections about object storage checker and journal checker.
-
Antoine Lambert authored
Allow configuring and triggering the scrubbing of an object storage with the swh-scrubber CLI, either using partitions of contents provided by a storage or by consuming the content kafka topic from a SWH journal (in that case, the --use-journal flag must be provided to the "swh scrubber check run" command). Related to #4694.
-
Antoine Lambert authored
Instead of reinventing the wheel, prefer to use the check method from the object storage interface for verifying content presence and integrity. Related to #4694.
-
Antoine Lambert authored
Add ObjectStorageCheckerFromJournal class to consume content ids from a kafka topic in order to check their presence in a given object storage, and also to check their integrity by fetching their bytes and recomputing checksums. Related to #4694.
-
Antoine Lambert authored
-
Antoine Lambert authored
-
Antoine Lambert authored
-
- Apr 04, 2024
-
-
David Douard authored
Deprecate the former ones.
-
- Apr 02, 2024
-
-
The default behavior of Click is to rewrap text based on the width of the terminal. As a consequence, the sample YAML config for the scrubber displayed in the command help is quite unreadable, because indentation is lost. So use \b markers in the docstring to ensure proper display of the YAML config by Click.
-
- Mar 29, 2024
-
-
David Douard authored
-
- Mar 25, 2024
-
-
Antoine Lambert authored
Add class ObjectStorageChecker to detect missing and corrupted contents in an object storage. It iterates on content objects referenced in a storage instance, checks that they are available in a given object storage instance, then retrieves their bytes from it in order to recompute checksums and detect corruptions. Related to #4694.
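The check described above can be illustrated with a minimal, self-contained sketch. It is not the actual ObjectStorageChecker code: the function name `check_contents` is hypothetical, a plain dict stands in for the object storage, and only a SHA1 checksum is recomputed.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple


def check_contents(
    referenced_sha1s: Iterable[bytes],
    objstorage: Dict[bytes, bytes],
) -> Tuple[List[bytes], List[bytes]]:
    """Return (missing, corrupt) content ids.

    `referenced_sha1s` plays the role of the content ids listed by a storage,
    `objstorage` is a hypothetical in-memory stand-in for an object storage.
    """
    missing: List[bytes] = []
    corrupt: List[bytes] = []
    for sha1 in referenced_sha1s:
        data = objstorage.get(sha1)
        if data is None:
            # Referenced by the storage but absent from the object storage.
            missing.append(sha1)
        elif hashlib.sha1(data).digest() != sha1:
            # Present, but the recomputed checksum does not match its id.
            corrupt.append(sha1)
    return missing, corrupt


good = b"hello"
good_id = hashlib.sha1(good).digest()
bad_id = hashlib.sha1(b"original bytes").digest()
absent_id = hashlib.sha1(b"never stored").digest()

# bad_id maps to bytes that no longer hash to bad_id; absent_id is not stored.
objstorage = {good_id: good, bad_id: b"flipped bytes"}
missing, corrupt = check_contents([good_id, bad_id, absent_id], objstorage)
```

In this sketch `missing` ends up holding only `absent_id` and `corrupt` only `bad_id`; the real checker records such findings in the scrubber database rather than returning lists.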
-
Antoine Lambert authored
Promote use of on_eof parameter instead.
-
- Mar 22, 2024
-
-
Antoine Lambert authored
To simplify adding an objstorage checker later, extract the common code and features of the current checkers into abstract base classes. Related to #4694.
-
- Mar 13, 2024
-
-
Antoine Lambert authored
Remove use of the --import-mode=importlib pytest option and use the new consider_namespace_packages option to fix test execution with the latest pytest release.
-
- Feb 06, 2024
-
-
David Douard authored
The next partition to check, as returned by the checked_partition_iter_next() iterator, should never exceed the maximum number of partitions in the config, nor should such a partition be added to the database.
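The invariant stated above can be sketched as follows. This is only an illustration under assumptions, not the real checked_partition_iter_next() implementation: the function `next_partitions` is hypothetical, and a Python set stands in for the database table of checked partitions.

```python
from typing import Iterator, Set


def next_partitions(nb_partitions: int, already_checked: Set[int]) -> Iterator[int]:
    """Yield partition ids still to check, never reaching nb_partitions.

    `already_checked` is a hypothetical stand-in for the database state.
    """
    # Partition ids are assumed to run from 0 to nb_partitions - 1.
    for partition_id in range(nb_partitions):
        if partition_id not in already_checked:
            # Record the partition before handing it out (stand-in for the
            # database write); out-of-range ids are never recorded.
            already_checked.add(partition_id)
            yield partition_id


checked = {0, 2}
result = list(next_partitions(4, checked))
```

Because the loop is bounded by `nb_partitions`, the iterator simply stops once every partition has been handed out, and no out-of-range id is ever stored.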
-
- Feb 05, 2024
-
-
Antoine Lambert authored
Related to swh/meta#5075.
-
- Feb 02, 2024
-
-
Nicolas Dandrimont authored
-
- Dec 05, 2023
-
-
David Douard authored
-
David Douard authored
-
David Douard authored
-
- Dec 03, 2023
-
-
David Douard authored
-
- Nov 30, 2023
-
-
David Douard authored
- Nov 24, 2023
-
-
Antoine Lambert authored
It now requires a swh-graph server running or connection errors appear. Use swh-graph NaiveClient to avoid spawning a real graph server during the tests.
-
- Oct 17, 2023
-
-
David Douard authored
As well as a command to list partitions being checked. For example:

```
$ swh scrubber check stats snapshot_16 -j
{
  "config": {
    "name": "snapshot_16",
    "datastore": {
      "package": "storage",
      "cls": "postgresql",
      "instance": "postgresql:///?service=swh-storage"
    },
    "object_type": "snapshot",
    "nb_partitions": 65536,
    "check_hashes": true,
    "check_references": true
  },
  "min_duration": 0.002196,
  "max_duration": 0.107398,
  "avg_duration": 0.005969,
  "checked_partition": 65536,
  "running_partition": 0,
  "missing_object": 0,
  "missing_object_reference": 0,
  "corrupt_object": 0
}

$ swh scrubber check running cfg1
Running partitions for cfg1 [id=1, type=snapshot]:
0: running since today (20 minutes)
```
-
- Oct 16, 2023
-
-
David Douard authored
init` command
-
- Oct 12, 2023
-
-
David Douard authored
-
David Douard authored
These flags allow configuring a checking session that includes only one of the two possible checks (hash computation and reference validation).
-
David Douard authored
Which allows removing the dependency on types-pyyaml in the [testing] extra.
-
David Douard authored
These tables used to reference the datastore the invalid/missing object was found in, but did not keep the config entry, i.e. the checking session during which the invalid/missing object was found, which can be an issue when more than one checking session is executed on a given datastore. This replaces the `datastore` field of the `corrupt_object`, `missing_object` and `missing_object_reference` tables by `config_id`. Adapt all the code accordingly. Note that it slightly changes the cli usage: the kafka checker now needs a config entry, thus a kafka checking session can only target a given object type (i.e. one kafka topic). The migration script will fill the config_id column for corrupt_object using the check_config entry that matches the object_type (of corrupt_object) and datastore. For missing_object and missing_object_reference, it will use the latter table to identify the check_config entry corresponding to the object type of the reference_id and to the datastore, since it is a checking session on this object type that will generate a missing object entry (which is generally not of the same type). For the missing_object table, the config_id will be the one extracted from missing_object_reference (joining on the missing_id column). Note that the migration script will fail if there are rows in one of these tables for which there exists more than one possible config entry (i.e. with the same object_type and datastore).
-
- Sep 21, 2023
-
-
David Douard authored
was missing the flake8-bugbear dependency, which effectively disabled the line-too-long check.
-
- Aug 24, 2023
-
-
Antoine R. Dumont authored
Previously, in production, this would retrieve the configuration of the other backend as those configurations are named the same. Refs. #4696
-
Antoine R. Dumont authored
To avoid returning only the first one when multiple configurations with the same name exist for different backends to scrub. Refs. #4696
-
- Jul 26, 2023
-
-
Antoine R. Dumont authored
It's popping up after having run tests.
-
Antoine R. Dumont authored
This was found while deploying the new version.
-