- Mar 22, 2024
-
-
vlorentz authored
-
- Mar 05, 2024
-
-
- Feb 13, 2024
-
-
Antoine Lambert authored
It enables filtering on a specific visit type when searching for a visit by date. Related to swh-web#4786.
-
- Feb 09, 2024
-
-
Antoine Lambert authored
-
- Feb 06, 2024
-
-
Antoine Lambert authored
Related to swh/meta#5075.
-
- Feb 02, 2024
-
-
Nicolas Dandrimont authored
swh.storage doesn't actually declare the dependency on pytest-postgresql; it comes through swh.core[testing].
-
- Jan 17, 2024
-
-
David Douard authored
Where async usage got dropped from the discovery protocol.
-
- Dec 11, 2023
-
-
David Douard authored
-
- Dec 05, 2023
-
-
David Douard authored
-
Antoine Lambert authored
-
- Dec 04, 2023
-
-
David Douard authored
and replace comment type annotations with explicit ones.
-
- Dec 03, 2023
-
- Nov 29, 2023
-
-
David Douard authored
-
David Douard authored
-
- Nov 25, 2023
-
- Nov 24, 2023
-
-
Antoine Lambert authored
When raising a QueryTimeout exception, forward the arguments of the caught psycopg2 QueryCanceled exception to it.
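A minimal sketch of the forwarding pattern described above. Both exception classes here are local stand-ins (assumptions for illustration), not the real psycopg2 or swh.storage classes, and `run_query` is a hypothetical helper:

```python
class QueryCanceled(Exception):
    """Stand-in for the psycopg2 QueryCanceled exception (illustration only)."""


class QueryTimeout(Exception):
    """Stand-in for the storage-level timeout exception (illustration only)."""


def run_query(execute):
    """Run `execute()` and translate a driver-level cancellation."""
    try:
        return execute()
    except QueryCanceled as exc:
        # Forward the original arguments so the driver's message is preserved.
        raise QueryTimeout(*exc.args) from exc


def slow_query():
    # Simulates the database cancelling a statement.
    raise QueryCanceled("canceling statement due to statement timeout")


try:
    run_query(slow_query)
except QueryTimeout as exc:
    forwarded_args = exc.args
```

With the arguments forwarded, whoever catches `QueryTimeout` still sees the message produced by the database driver.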
-
- Nov 16, 2023
-
-
David Douard authored
Convert README from markdown to ReST to make it embeddable in docs/index.rst
-
- Nov 07, 2023
-
-
Jérémy Bobbio (Lunar) authored
`swh storage remove-old-object-reference-partitions 2023-09-01` can be used to remove all partition tables for weeks before the given date. By default, this prints the weeks for which tables would be dropped and asks for confirmation. The `--force` option proceeds directly. The command refuses to drop all partitions, as that is most probably an error.
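The selection logic could be sketched as follows. This is not the actual command implementation: `partitions_to_drop` is a hypothetical helper, partitions are modelled as `(start, end)` date pairs with an exclusive end, and the rule that a week is selected only when it ends on or before the given date is an assumption. The confirmation prompt and `--force` handling are left out.

```python
from datetime import date


def partitions_to_drop(partitions, before):
    """Return the (start, end) partitions that end on or before `before`.

    Raises ValueError when every existing partition would be selected,
    mirroring the "refuse to drop all partitions" safeguard.
    """
    selected = [(start, end) for start, end in partitions if end <= before]
    if partitions and len(selected) == len(partitions):
        raise ValueError("refusing to drop all partitions")
    return selected


# Three weekly partitions (hypothetical data).
weeks = [
    (date(2023, 8, 21), date(2023, 8, 28)),
    (date(2023, 8, 28), date(2023, 9, 4)),
    (date(2023, 9, 4), date(2023, 9, 11)),
]
to_drop = partitions_to_drop(weeks, date(2023, 9, 1))
```

Raising instead of silently dropping everything keeps a mistyped date from wiping every partition table.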
-
Jérémy Bobbio (Lunar) authored
In order to be able to remove older partitions, we want to list those that actually exist. `object_references_list_partition()` will return a list of ObjectReferencesPartition, a new dataclass describing partitions. This new method replaces `get_object_references_partition_bounds()`, which was only available in tests.
-
vlorentz authored
-
vlorentz authored
-
- Oct 04, 2023
-
-
Jérémy Bobbio (Lunar) authored
In order to be able to handle takedown notices, we need to be able to remove objects from storage. Depends on !1077. Related to swh-alter#5.
-
Jérémy Bobbio (Lunar) authored
This is a pretty direct adaptation of what was done in https://gitlab.softwareheritage.org/swh/devel/snippets/-/blob/0d8b6877/takedowns/gen_removal_sql.py. Closes: #4687. Depends on !1077.
-
Jérémy Bobbio (Lunar) authored
swh-alter needs an interface to remove objects from the storage (so we can handle takedown notices). The chosen interface is optimal from swh-alter's point of view: it identifies a whole range of objects of different types that can be removed alongside a given set of origins. Giving all objects to be removed at once might also help with consistency constraints inside the various storage facilities. Only objects from this facility will be removed. The same method should be called on other storage, objstorage, or journal instances where the specified objects need to be removed.
-
- Sep 26, 2023
-
-
David Douard authored
it was using a generator like a list, effectively only inserting one content object per batch. The test needs to be modified to actually show the misbehavior: it must first insert a single object before inserting a list of objects, otherwise the test is green without the fix (the reason is left as an exercise for the curious reader). Also fix a small type inconsistency in content_add() and use the same test pattern in test_content_add().
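The general pitfall behind this kind of bug can be illustrated as below. This is not the code that was fixed, only a sketch of why a generator cannot be treated like a list; the `hashes` helper and the sample data are hypothetical.

```python
def hashes(contents):
    # A generator expression: values are produced lazily and only once.
    return (c["sha1"] for c in contents)


contents = [{"sha1": "aa"}, {"sha1": "bb"}, {"sha1": "cc"}]

gen = hashes(contents)
first_pass = list(gen)   # consumes the generator entirely
second_pass = list(gen)  # the generator is exhausted: nothing left

# Materializing once gives a value that can be iterated repeatedly.
as_list = list(hashes(contents))
reused = [list(as_list), list(as_list)]
```

Code that iterates, indexes, or measures the same generator more than once behaves differently from the list-based version, usually without raising any error.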
-
David Douard authored
-
Jérémy Bobbio (Lunar) authored
When browsing methods in StorageInterface, it is very easy to overlook helper functions that happen to live in the `swh.storage.algos` package. To make those slightly easier to find, add “see also” sections for listing directory entries, branches, origin visits and origin visit statuses.
-
- Sep 25, 2023
-
-
Antoine R. Dumont authored
This fails in Debian bullseye. According to the storage interface for the 'directory_missing' method, there is no order guarantee in the output. Refs. swh/infra/sysadm-environment#5047
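When an interface gives no ordering guarantee, a test can compare results in an order-insensitive way. A small sketch, where `directory_missing` is a hypothetical stand-in and not the real storage method:

```python
def directory_missing(requested, present):
    """Stand-in: yield requested ids absent from `present`, in arbitrary order."""
    return iter(set(requested) - set(present))


missing = list(directory_missing([b"a", b"b", b"c"], [b"b"]))

# Compare as sets so the assertion does not depend on iteration order.
assert set(missing) == {b"a", b"c"}
```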
-
- Sep 22, 2023
-
-
Antoine R. Dumont authored
This fails in Debian bullseye. According to the storage interface for that method, there is no order guarantee in the output. Refs. swh/infra/sysadm-environment#5047
-
- Sep 20, 2023
-
-
JAVA_HOME needs to point to the installation directory, not the `java` binary itself.
-
- Sep 19, 2023
-
-
Antoine R. Dumont authored
This fails in Debian bullseye for some reason. According to the storage interface for that method, there is no order guarantee in the output either. Refs. swh/infra/sysadm-environment#5047
-
- Sep 12, 2023
-
-
Raphaël Gomès authored
The initial implementation was incorrectly put in `swh-loader-core`. This simply moves it (along with its sister change in the other module) to the correct place.
-
- Sep 11, 2023
-
-
Antoine Lambert authored
It fixes the Debian package build on unstable.
-
- Sep 06, 2023
-
-
David Douard authored
This accepts a file of SWHIDs of objects that are known to be invalid (hash mismatch) but should be replayed anyway (typically because they exist as is in the original storage). The file is expected to have rows like: `swh:1:xxx:<invalid_hex_hash>,<expected_hex_hash>` [...] Note that the CLI only accepts SWHIDs in the exception file, while the backend (ModelObjectDeserializer) supports all HashableObject types. We currently do not need this feature in the CLI tool for other object types, and doing it this way is simpler in terms of type annotations.
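Reading such a file might look like the sketch below. The `parse_exceptions` function and the returned mapping (SWHID string to expected hash bytes) are assumptions for illustration, not the actual CLI code; only the row format comes from the description above.

```python
def parse_exceptions(lines):
    """Map each SWHID (built from the invalid hash) to the expected hash bytes."""
    known = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        swhid, expected_hex = line.split(",", 1)
        known[swhid] = bytes.fromhex(expected_hex)
    return known


# Hypothetical sample row: 20-byte hashes written as 40 hex digits.
sample = ["swh:1:rev:" + "00" * 20 + "," + "11" * 20 + "\n", "\n"]
known_mismatches = parse_exceptions(sample)
```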
-
David Douard authored
The idea is that two workers may insert similar directories concurrently, and thus attempt to create identical DirectoryEntry objects in concurrent transactions, making one of the two transactions fail at commit time with a UniqueViolation error. But since rows in a `directory_entry_xxx` table consist only of the triplet `(target, name, perms)` and we run the database at the read committed isolation level, by the time the next query (filling the `tmp_directory` table) in the `swh_directory_entry_add()` SQL function is executed, the conflicting rows inserted by other transactions have been committed and are visible in this transaction, so these conflicts can simply be ignored. Upgrade db version to 190.
-
- Sep 05, 2023
-
-
David Douard authored
Hypothesis is not happy that a few hypothesis-given tests are declared in storage_tests but actually used in several derived test cases (test_postgresql, test_cassandra, etc.), and raises an error pointing to https://hypothesis.readthedocs.io/en/latest/settings.html#hypothesis.HealthCheck.differing_executors This is a dirty solution that hides the effects of the original issue rather than properly fixing the root cause. Move the definition of disabled health checks into each test file rather than `conftest.py`, since this newly disabled health check is not required in algos/test_snapshot.py
-
- Sep 04, 2023
-
-
Antoine Lambert authored
Truncate a keyspace table only if it is not empty when executing the teardown phase of the function-scoped swh_storage_cassandra_backend_config fixture. This brings a twofold speedup when executing all Cassandra-related tests.
-
Antoine Lambert authored
Bump it from 2 to 30 seconds in order to fix flaky tests on Jenkins.
-
- Sep 01, 2023
-
-
Jayesh authored
- Add a simple query builder for dynamic SQL queries
- Add a method to make pagination clause and logic consistent
- Refactor 'origin_visit_get_range' using the builder
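A query builder of this kind can be sketched as below. The `QueryBuilder` class, its method names, and the table and column names in the usage are all hypothetical and do not reflect the actual swh.storage code. Column names are interpolated into the SQL text, so they must come from trusted code, never from user input.

```python
class QueryBuilder:
    """Accumulate SQL fragments and their parameters separately."""

    def __init__(self):
        self.parts = []
        self.params = []

    def add(self, sql, *params):
        self.parts.append(sql)
        self.params.extend(params)
        return self

    def add_pagination(self, column, after, limit, order="ASC"):
        # Assumes a WHERE clause has already been added to the query.
        if order not in ("ASC", "DESC"):
            raise ValueError(f"invalid order: {order}")
        op = ">" if order == "ASC" else "<"
        if after is not None:
            self.add(f"AND {column} {op} %s", after)
        return self.add(f"ORDER BY {column} {order} LIMIT %s", limit)

    def build(self):
        return " ".join(self.parts), tuple(self.params)


query, params = (
    QueryBuilder()
    .add("SELECT * FROM origin_visit WHERE origin = %s", 42)
    .add_pagination("visit", after=10, limit=5)
    .build()
)
```

Keeping values in a separate parameter tuple lets the database driver handle quoting, while the pagination helper keeps the comparison operator and the sort order consistent with each other.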
-
- Aug 31, 2023
-
-
Antoine Lambert authored
-