Metadata-Version: 2.1
Name: swh.scrubber
Version: 0.0.1
Summary: Software Heritage Datastore Scrubber
Home-page: https://forge.softwareheritage.org/diffusion/swh-scrubber
Author: Software Heritage developers
Author-email: swh-devel@inria.fr
License: UNKNOWN
Project-URL: Bug Reports, https://forge.softwareheritage.org/maniphest
Project-URL: Funding, https://www.softwareheritage.org/donate
Project-URL: Source, https://forge.softwareheritage.org/source/swh-scrubber
Project-URL: Documentation, https://docs.softwareheritage.org/devel/swh-scrubber/
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Operating System :: OS Independent
Classifier: Development Status :: 3 - Alpha
Requires-Python: >=3.7
Description-Content-Type: text/x-rst
Provides-Extra: testing
License-File: LICENSE
License-File: AUTHORS
Software Heritage - Datastore Scrubber
======================================
Tools to periodically check data integrity in swh-storage and swh-objstorage,
report errors, and (try to) fix them.
This is a work in progress; some of the components described below do not
exist yet (Cassandra storage checker, objstorage checker, recovery, and reinjection).
The Scrubber package is made of the following parts:
Checking
--------
Highly parallel processes continuously read objects from a data store,
compute their checksums, and record any failure in a database, along with the
data of the corrupt object.
There is one "checker" for each datastore package: storage (PostgreSQL and Cassandra),
journal (Kafka), and objstorage.
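The checking loop described above can be illustrated with a minimal, single-process sketch. It is not the scrubber's actual code: it assumes, for illustration only, that an object's identifier is the hex SHA-1 of its raw bytes, and it uses an in-memory SQLite database in place of the real failure database. The names ``check_objects`` and ``corrupt_object`` are hypothetical.

```python
import hashlib
import sqlite3
from typing import Iterable, Tuple


def check_objects(
    objects: Iterable[Tuple[str, bytes]], db: sqlite3.Connection
) -> int:
    """Recompute each checksum; record mismatches along with the object's data.

    Assumption: the identifier is the hex SHA-1 of the object's bytes.
    """
    db.execute(
        "CREATE TABLE IF NOT EXISTS corrupt_object (id TEXT PRIMARY KEY, data BLOB)"
    )
    failures = 0
    for object_id, data in objects:
        if hashlib.sha1(data).hexdigest() != object_id:
            # Keep the corrupt bytes so that a later recovery job can use them.
            db.execute(
                "INSERT OR REPLACE INTO corrupt_object (id, data) VALUES (?, ?)",
                (object_id, data),
            )
            failures += 1
    db.commit()
    return failures


# Hypothetical data store: one healthy object, one whose bytes were altered.
store = {
    hashlib.sha1(b"hello").hexdigest(): b"hello",
    hashlib.sha1(b"world").hexdigest(): b"w0rld",  # simulated corruption
}
db = sqlite3.connect(":memory:")
n_failures = check_objects(store.items(), db)
corrupt_rows = db.execute("SELECT id, data FROM corrupt_object").fetchall()
```

Storing the corrupt bytes next to the expected identifier is what lets a later job attempt recovery without reading the data store again. The sketch leaves out the parallelism mentioned above.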
Recovery
--------
Then, from time to time, jobs go through the list of known corrupt objects
and try to recover the original objects through various means:
* Brute-forcing variations until they match their checksum
* Recovering from another data store
* As a last resort, recovering from known origins, if any
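As an illustration of the first strategy, the sketch below brute-forces single-bit flips of a corrupt object until one candidate matches the expected checksum. The choice of single-bit flips, the use of SHA-1 over the raw bytes, and the names ``single_bit_flips`` and ``recover`` are assumptions made for this example; the text above does not specify which variations are tried.

```python
import hashlib
from typing import Iterator, Optional


def single_bit_flips(data: bytes) -> Iterator[bytes]:
    """Yield every variation of ``data`` that differs by exactly one bit."""
    buf = bytearray(data)
    for i in range(len(buf)):
        for bit in range(8):
            buf[i] ^= 1 << bit  # flip one bit
            yield bytes(buf)
            buf[i] ^= 1 << bit  # restore it before trying the next one


def recover(corrupt: bytes, expected_sha1: str) -> Optional[bytes]:
    """Return the first variation whose SHA-1 matches, or None if none does."""
    for candidate in single_bit_flips(corrupt):
        if hashlib.sha1(candidate).hexdigest() == expected_sha1:
            return candidate
    return None


original = b"Software Heritage"
expected = hashlib.sha1(original).hexdigest()

damaged = bytearray(original)
damaged[3] ^= 0x10  # simulate a single flipped bit
recovered = recover(bytes(damaged), expected)
```

This search tries 8 candidates per byte, so its cost grows linearly with the object size. When no candidate matches, ``recover`` returns ``None`` and the other strategies in the list would have to be tried.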
Reinjection
-----------
Finally, when an original object is recovered, it is reinjected into the original
data store, replacing the corrupt one.
docs/README.rst
\ No newline at end of file
swh-scrubber (0.0.1-1~swh1) unstable-swh; urgency=medium
  * Initial release
 -- Nicolas Dandrimont <nicolas@dandrimont.eu>  Thu, 31 Mar 2022 19:29:54 +0200
Source: swh-scrubber
Maintainer: Software Heritage developers <swh-devel@inria.fr>
Section: python
Priority: optional
Build-Depends:
 debhelper-compat (= 13),
 dh-python (>= 3),
 python3-all,
 python3-pytest (<< 7.0.0),
 python3-pytest-mock,
 python3-pytest-postgresql,
 python3-setuptools,
 python3-setuptools-scm,
 python3-swh.core (>= 0.3),
 python3-swh.core.db.pytestplugin,
 python3-swh.graph.client,
 python3-swh.journal (>= 0.9.0),
 python3-swh.model (>= 5.0.0),
 python3-swh.storage (>= 1.1.0),
 python3-yaml
Rules-Requires-Root: no
Standards-Version: 4.6.0
Homepage: https://forge.softwareheritage.org/source/swh-scrubber
Package: python3-swh.scrubber
Architecture: all
Depends: ${misc:Depends}, ${python3:Depends},
Description: Software Heritage Datastore Scrubber
Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Files: *
Copyright: 2015-2022 The Software Heritage developers
License: GPL-3+
License: GPL-3+
 This program is free software: you can redistribute it and/or modify
 it under the terms of the GNU General Public License as published by
 the Free Software Foundation; either version 3 of the License, or
 (at your option) any later version.
 .
 This program is distributed in the hope that it will be useful,
 but WITHOUT ANY WARRANTY; without even the implied warranty of
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 GNU General Public License for more details.
 .
 You should have received a copy of the GNU General Public License
 along with this program. If not, see <http://www.gnu.org/licenses/>.
 .
 On Debian systems, the complete text of the GNU General Public
 License version 3 can be found in `/usr/share/common-licenses/GPL-3'.
[DEFAULT]
upstream-branch=debian/upstream
upstream-tag=debian/upstream/%(version)s
upstream-vcs-tag=v%(version)s
debian-branch=debian/unstable-swh
pristine-tar=True
#!/usr/bin/make -f
export PYBUILD_NAME=swh.scrubber
export PYBUILD_TEST_ARGS=-vv
%:
	dh $@ --with python3 --buildsystem=pybuild
override_dh_install:
	dh_install
	rm -v $(CURDIR)/debian/python3-*/usr/lib/python*/dist-packages/swh/__init__.py
3.0 (quilt)
[flake8]
# E203: whitespaces before ':' <https://github.com/psf/black/issues/315>
# E231: missing whitespace after ','
# W503: line break before binary operator <https://github.com/psf/black/issues/52>
ignore = E203,E231,W503
max-line-length = 88
[egg_info]
tag_build =
tag_date = 0