- Jun 24, 2024
-
-
Jérémy Bobbio (Lunar) authored
A revision in a log can be missing from the storage. While holes are unusual, they can happen. Dedicated tests were added for the issue.
TODO:
- [ ] A decision needs to be made on what should be returned by `revision_short_log()` in case of a missing revision.
-
Jérémy Bobbio (Lunar) authored
There is no order guarantee between revision parents when calling `revision_log()`. We now use sets instead of lists to improve test reliability.
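As a minimal sketch of why sets help here (the dictionaries and the `ids_as_set` helper below are illustrative assumptions, not the project's test code), comparing results as sets makes the assertion independent of the order in which revisions come back:

```python
def ids_as_set(entries):
    """Collect revision ids from log entries, ignoring their order."""
    return {entry["id"] for entry in entries}


# Two hypothetical log results listing the same revisions in different orders.
log_a = [{"id": "rev1"}, {"id": "rev2"}, {"id": "rev3"}]
log_b = [{"id": "rev1"}, {"id": "rev3"}, {"id": "rev2"}]

# Comparing lists is order-sensitive, so this kind of check can fail randomly.
lists_equal = log_a == log_b
# Comparing sets only looks at membership.
sets_equal = ids_as_set(log_a) == ids_as_set(log_b)
```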
-
- Jun 06, 2024
-
-
Jérémy Bobbio (Lunar) authored
It is fairly easy to overlook calling `storage.flush()` after adding some objects just before the end of a process. In a direct configuration, everything will be fine but if a buffering proxy is configured, this means that the objects still in the buffers are going to be lost. Let’s add a warning if this is the case.
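The idea can be sketched as follows. This is a toy buffering layer written for illustration only: the `BufferingStore` class, its `close()` hook and the warning text are assumptions, not the actual proxy code.

```python
import warnings


class BufferingStore:
    """Toy buffering layer: objects only reach `backend` when flush() is called."""

    def __init__(self, backend):
        self.backend = backend  # hypothetical sink; any list-like object works
        self.buffer = []

    def add(self, obj):
        self.buffer.append(obj)

    def flush(self):
        self.backend.extend(self.buffer)
        self.buffer.clear()

    def close(self):
        # Warn instead of silently dropping whatever was never flushed.
        if self.buffer:
            warnings.warn(
                f"{len(self.buffer)} object(s) still buffered; call flush() first",
                RuntimeWarning,
                stacklevel=2,
            )
```

Calling `close()` on a store with a non-empty buffer emits the warning; calling it after `flush()` stays silent.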
-
Nicolas Dandrimont authored
Deleting a snapshot used to remove the corresponding entries in the `snapshot_branch` table. These entries are shared between snapshots (e.g. all snapshots with a `HEAD` pointing to `refs/heads/main` share the same line in this table), and we don't keep a reverse index to efficiently remove orphan lines, so we now skip the removal altogether. Add a test to ensure that the corruption doesn't happen anymore.
-
- Jun 05, 2024
-
-
Jérémy Bobbio (Lunar) authored
These changes are needed to allow `swh alter remove` to remove RawExtrinsicMetadata and ExtID objects from the PostgreSQL and Cassandra implementations of the storage. As RawExtrinsicMetadata objects can be addressed using an ExtendedSWHID, they can now be removed using `object_delete()`. For ExtID, a new method `extid_delete_for_target()` has been added to `ObjectDeletionInterface`. It will remove any ExtID object targeting one of the objects in a list of SWHIDs given as an argument. Address swh-alter!11
-
Jérémy Bobbio (Lunar) authored
The cursor provided by the `@db_transaction` decorator was not being passed properly.
-
- Jun 04, 2024
-
-
Nicolas Dandrimont authored
To be able to use the `swh db` command line, the configurations need to have the following shape:

    <toplevel key>:
      cls: postgresql
      db: <postgresql DSN>

The usage of `blocking_db` and `masking_db` instead of plain `db`, and lack of overridability in the swh db utilities, made schema migrations more annoying than necessary. The old configuration key still works but raises a Deprecation warning. The configuration of the proxies themselves is unchanged, as having a differently-named key for the proxy-specific database makes sense to avoid confusion.
-
- Jun 03, 2024
-
-
Nicolas Dandrimont authored
The prefix matching semantics rely on URL patterns not ending with a /. URLs ending in / will only be used for exact matches. Warn the operator of this situation.
-
Nicolas Dandrimont authored
Instead of doing some collation-dependent, somewhat brittle prefix matching in PostgreSQL, generate a list of checked prefixes and match them exactly in the database. This means we can do both exact and prefix matching in one go.
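A rough sketch of the approach, with a Python set standing in for the database table. The function names and the exact splitting rules are assumptions made for illustration; the real implementation may differ.

```python
def candidate_prefixes(url):
    """Return the '/'-separated prefixes of `url`, longest first.

    A trailing '/' is ignored, so the first element is the trimmed URL itself.
    """
    scheme, sep, rest = url.partition("://")
    parts = rest.rstrip("/").split("/")
    return [scheme + sep + "/".join(parts[:n]) for n in range(len(parts), 0, -1)]


def best_match(url, rules):
    """Check every candidate with plain equality; the longest hit wins."""
    hits = [prefix for prefix in candidate_prefixes(url) if prefix in rules]
    return hits[0] if hits else None
```

Because every candidate is compared with plain equality, the lookup no longer depends on how the database collates strings, and the exact match is simply the longest candidate.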
-
Antoine Lambert authored
If that parameter is provided, it makes it possible to get the latest snapshot produced by a specific visit type. Related to swh/meta#5092.
-
- May 31, 2024
-
-
This is an alternative to modifying the `person` table. That technique only worked on the PostgreSQL backend, not with Cassandra, which does not have a `person` table (authors and committers are inlined in the revision and release tables). This does not come with CLI tooling to manage the display name table entries for now.
-
- May 28, 2024
-
-
Pierre-Yves David authored
I am trying to use mypy to detect psycopg2 → psycopg3 differences and the noise is getting in the way.
-
Pierre-Yves David authored
This lets us type various functions mypy was warning about.
-
- May 17, 2024
-
-
Pierre-Yves David authored
-
Pierre-Yves David authored
mypy now understands that BaseModel objects have an object_type.
-
Pierre-Yves David authored
-
- May 15, 2024
-
-
David Douard authored
This adds a 'blocking_origin_log' table in the database where each filtering event is logged, be it a deny or an explicit accept event.
-
David Douard authored
This proxy prevents registered origins from being visited again. If an origin URL matches a blocking rule, then any attempt to add an Origin, OriginVisit or OriginVisitStatus object targeting this URL will be blocked, raising a BlockedOriginException. This is implemented in a similar fashion to the MaskingProxy, sharing the same management logic as the latter.
The URL matching rules are, given a checked URL:
- check for an exact match in the blocking rules on:
  1. the given URL
  2. the trimmed URL (if it has a trailing /)
  3. the extension-less URL if it ends with a known suffix (e.g. '.git')
- if no exact match is found, look for the best prefix match on split sub-path URLs (i.e. the longest URL in the blocking rules that the checked URL starts with, splitting on '/')
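The exact-match step of these rules could be sketched as below. The `KNOWN_SUFFIXES` tuple and the `exact_candidates` helper are hypothetical names, and stripping the suffix from the trimmed URL rather than from the given one is an assumption.

```python
KNOWN_SUFFIXES = (".git",)  # assumption: the full list of suffixes is not given here


def exact_candidates(url):
    """Return the URLs to try for an exact match, in rule order."""
    candidates = [url]
    trimmed = url.rstrip("/")
    if trimmed != url:
        # The URL had a trailing '/': also try it without.
        candidates.append(trimmed)
    for suffix in KNOWN_SUFFIXES:
        if trimmed.endswith(suffix):
            # Also try the extension-less form.
            candidates.append(trimmed[: -len(suffix)])
            break
    return candidates
```

If none of these candidates appears in the blocking rules, the lookup falls back to the prefix match described in the last rule.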
-
- May 14, 2024
-
-
vlorentz authored
This will be used by swh-dataset to list all SWHIDs to mask before an export, instead of querying the database over and over while exporting.
-
- May 06, 2024
-
- May 03, 2024
-
-
vlorentz authored
-
- May 02, 2024
-
-
David Douard authored
Make the MaskingProxy use its own db rather than sharing the main storage one, with a fragile use of flavors to shoehorn it in there. The migration script for the main storage db has to recreate the db flavor table and (enum) type, since the latter cannot be altered to remove an entry. On the masking side, we set the db model version to 193 because the currently deployed version is actually set to the main storage's db version (thus 192).
-
- Apr 11, 2024
-
-
Jérémy Bobbio (Lunar) authored
-
- Apr 02, 2024
-
-
Nicolas Dandrimont authored
inspect.getcallargs will return `args`/`kwargs` unless the method is wrapped using `functools.wraps` at every layer, and all the signatures are compatible, which is apparently not the case for all proxies.
-
Nicolas Dandrimont authored
Our error handlers were missing an application of the extra encoders, so SWHIDs weren't encodable as exception arguments, and the new MaskedStatuses weren't supported by our encoders.
-
Nicolas Dandrimont authored
-
- Mar 29, 2024
-
-
Nicolas Dandrimont authored
This reverts commit f7588d35. Prometheus keeps the original endpoint tag as `exported_endpoint`, so we can just keep it as such (and avoid losing historical metrics).
-
David Douard authored
-
Online help:

Usage: swh storage masking [OPTIONS] COMMAND [ARGS]...

  Configure masking on archived objects

  These tools require read/write access to the masking database. An entry
  must be added to the configuration file as follow:

    storage:
      …

    masking_admin:
      masking_db: "service=swh-masking-admin"

Options:
  -h, --help  Show this message and exit.

Commands:
  clear-request   Remove all masking states for the given request
  history         Get the history for a request
  list-requests   List masking requests
  new-request     Create a new request to mask objects
  object-state    Get the masking state for a set of SWHIDs
  status          Get the masking states defined by a request
  update-objects  Update the state of given objects
-
- Mar 28, 2024
-
-
Nicolas Dandrimont authored
-
Nicolas Dandrimont authored
-
Nicolas Dandrimont authored
-
Nicolas Dandrimont authored
To do this, we generate a dictionary for exact method name matches, and one for suffix matches.
-
Nicolas Dandrimont authored
-
Nicolas Dandrimont authored
Instead of generating a full Content object with the four hashes, which is more expensive.
-
Nicolas Dandrimont authored
We have to explicitly test this codepath as the masking proxy checks the computed swhid of the returned content, so we might as well tell pytest that we know that the call is deprecated.
-
Nicolas Dandrimont authored
To avoid calling __getattr__ multiple times on the same method, we can just use setattr to cache the built method for further calls. This avoids caveats of the LRU cache on instance methods (which can make garbage-collecting difficult).
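A minimal sketch of the technique, using a toy `CachingProxy` class rather than the actual proxy (the class name and the `builds` counter are illustrative assumptions). Since `__getattr__` is only invoked when normal lookup fails, storing the built method on the instance means later accesses bypass it entirely:

```python
class CachingProxy:
    """Toy proxy that builds wrapper methods lazily and caches them on the instance."""

    def __init__(self, backend):
        self.backend = backend
        self.builds = 0  # counts how often a wrapper is built, for illustration

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        target = getattr(self.backend, name)
        self.builds += 1

        def method(*args, **kwargs):
            return target(*args, **kwargs)

        # Store the wrapper in the instance __dict__: later lookups find it
        # directly and never reach __getattr__ again.
        setattr(self, name, method)
        return method
```

With a list as backend, calling `proxy.count(...)` twice builds the wrapper only once; the second call finds `count` in the instance dictionary.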
-
Nicolas Dandrimont authored
The target argument for raw_extrinsic_metadata_get_authorities is an ExtendedSWHID, there's no need to extend it again.
-
Nicolas Dandrimont authored
-
Nicolas Dandrimont authored
Passing it a CoreSWHID works most of the time as the value is stringified before accessing databases, but doesn't work when using ExtendedSWHID methods/attributes directly, for instance in a proxy.
-