- May 06, 2024
-
-
- May 03, 2024
-
-
vlorentz authored
-
- May 02, 2024
-
-
David Douard authored
Make the MaskingProxy use its own db rather than sharing the main storage one, which relied on a fragile use of flavors to shoehorn it in there. The migration script for the main storage db has to recreate the db flavor table and (enum) type, since the latter cannot be altered to remove an entry. On the masking side, we set the db model version to 193 because the currently deployed version is actually set to the main storage's db version (thus 192).
-
- Apr 11, 2024
-
-
Jérémy Bobbio (Lunar) authored
-
- Apr 02, 2024
-
-
Nicolas Dandrimont authored
inspect.getcallargs will return `args`/`kwargs` unless the method is wrapped using `functools.wraps` at every layer, and all the signatures are compatible, which is apparently not the case for all proxies.
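
A minimal sketch of the symptom, assuming a hypothetical `content_get` function and a bare forwarding `proxy` (neither is taken from the actual storage code). It only shows what `inspect.getcallargs` reports for a plain `*args, **kwargs` forwarder compared with the target function itself:

```python
import inspect


def content_get(ids, with_data=True):
    """Hypothetical stand-in for a storage method with a real signature."""
    return [(i, with_data) for i in ids]


def proxy(*args, **kwargs):
    """Hypothetical proxy layer that forwards everything untouched."""
    return content_get(*args, **kwargs)


# Through the forwarder, arguments are only known as `args`/`kwargs`.
seen_through_proxy = inspect.getcallargs(proxy, ["a"], with_data=False)

# Against the target function, arguments are bound to their real names.
seen_on_target = inspect.getcallargs(content_get, ["a"], with_data=False)

print(seen_through_proxy)
print(seen_on_target)
```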
-
Nicolas Dandrimont authored
Our error handlers were missing an application of the extra encoders, so SWHIDs weren't encodable as exception arguments, and the new MaskedStatuses weren't supported by our encoders.
-
Nicolas Dandrimont authored
-
- Mar 29, 2024
-
-
Nicolas Dandrimont authored
This reverts commit f7588d35. Prometheus keeps the original endpoint tag as `exported_endpoint`, so we can just keep it as such (and avoid losing historical metrics).
-
David Douard authored
-
Online help:

  Usage: swh storage masking [OPTIONS] COMMAND [ARGS]...

    Configure masking on archived objects

    These tools require read/write access to the masking database. An entry
    must be added to the configuration file as follow:

      storage:
        …

      masking_admin:
        masking_db: "service=swh-masking-admin"

  Options:
    -h, --help  Show this message and exit.

  Commands:
    clear-request   Remove all masking states for the given request
    history         Get the history for a request
    list-requests   List masking requests
    new-request     Create a new request to mask objects
    object-state    Get the masking state for a set of SWHIDs
    status          Get the masking states defined by a request
    update-objects  Update the state of given objects
-
- Mar 28, 2024
-
-
Nicolas Dandrimont authored
-
Nicolas Dandrimont authored
-
Nicolas Dandrimont authored
-
Nicolas Dandrimont authored
To do this, we generate a dictionary for exact method name matches, and one for suffix matches.
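
A minimal sketch of such a two-dictionary lookup, under stated assumptions: the method names, suffixes, and handler labels below are hypothetical and only illustrate the idea of checking exact names first and suffixes second.

```python
from typing import Dict, Optional

# Hypothetical tables; the real method names and handlers are not shown here.
EXACT_MATCHES: Dict[str, str] = {
    "content_get": "mask_contents",
    "origin_get": "mask_origins",
}
SUFFIX_MATCHES: Dict[str, str] = {
    "_get_random": "mask_single_id",
    "_missing": "passthrough",
}


def find_handler(method_name: str) -> Optional[str]:
    """Return the handler for a method: exact name first, then longest suffix."""
    if method_name in EXACT_MATCHES:
        return EXACT_MATCHES[method_name]
    # Try longer suffixes first so the most specific one wins.
    for suffix in sorted(SUFFIX_MATCHES, key=len, reverse=True):
        if method_name.endswith(suffix):
            return SUFFIX_MATCHES[suffix]
    return None
```

The exact-match dictionary keeps the common case to a single lookup; the suffix table is only scanned when no exact entry exists.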
-
Nicolas Dandrimont authored
-
Nicolas Dandrimont authored
Instead of generating a full Content object with the four hashes, which is more expensive.
-
Nicolas Dandrimont authored
We have to explicitly test this codepath as the masking proxy checks the computed swhid of the returned content, so we might as well tell pytest that we know that the call is deprecated.
-
Nicolas Dandrimont authored
To avoid calling __getattr__ multiple times on the same method, we can just use setattr to cache the built method for further calls. This avoids caveats of the LRU cache on instance methods (which can make garbage-collecting difficult).
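
A minimal sketch of this caching pattern, assuming a hypothetical `CachingProxy` class and a `FakeStorage` backend (the `built` counter exists only to make the effect visible):

```python
import functools
from typing import Any, Callable


class CachingProxy:
    """Hypothetical proxy that builds each wrapped method only once."""

    def __init__(self, storage: Any) -> None:
        self.storage = storage
        self.built = 0  # counts wrapper constructions, for illustration only

    def __getattr__(self, name: str) -> Callable[..., Any]:
        # Only reached when normal attribute lookup fails, i.e. on first access.
        method = getattr(self.storage, name)

        @functools.wraps(method)
        def wrapped(*args: Any, **kwargs: Any) -> Any:
            return method(*args, **kwargs)

        self.built += 1
        # Store the wrapper on the instance: later lookups find it in the
        # instance dictionary and never reach __getattr__ again.
        setattr(self, name, wrapped)
        return wrapped


class FakeStorage:
    """Hypothetical backend with a single method."""

    def content_get(self, ids):
        return list(ids)


proxy = CachingProxy(FakeStorage())
first = proxy.content_get
second = proxy.content_get
```

Because the cached wrapper lives in the instance's own dictionary, it goes away with the instance, unlike an entry held in a cache shared at class or module level.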
-
Nicolas Dandrimont authored
The target argument for raw_extrinsic_metadata_get_authorities is an ExtendedSWHID, there's no need to extend it again.
-
Nicolas Dandrimont authored
-
Nicolas Dandrimont authored
Passing it a CoreSWHID works most of the time as the value is stringified before accessing databases, but doesn't work when using ExtendedSWHID methods/attributes directly, for instance in a proxy.
-
Nicolas Dandrimont authored
The original implementation masked the snapshots targeted by visits if they were masked, when we really want to mask results only if the queried origin itself is masked.
-
Nicolas Dandrimont authored
-
-
Nicolas Dandrimont authored
This new masking proxy storage intercepts all information retrieval from the underlying storage, and matches the SWHIDs of returned objects to the contents of the masking database.

For simplicity, when any of the returned objects matches the masking database, a non-retryable MaskedObjectException is raised, with a dict mapping the masked SWHIDs to information about the masking request, including an opaque id and a masking state (temporary or permanent). It is up to the client to process this exception to display the information in a useful manner.

If necessary, a client fetching a batch of objects including some masked and non-masked ones could extract the ids of the masked objects and retry for the non-masked objects as well. If this usage becomes prevalent, it could be implemented as one more proxy.

When an object's SWHID (or a list thereof) is passed as argument to the storage function, we first call the underlying function to check the object for existence, before we attempt to match the object with the masking database. This avoids leaking information out of the masking database until it's absolutely needed, avoiding potential issues after a content removal has been processed.

For now, our implementation does not consider that the SWHID of masked objects itself needs to be masked. For instance, an unmasked Directory containing masked Contents will still allow being listed. Only accessing the data of the masked Content object itself would raise a MaskedObjectException. This choice was made to limit the impact of masked objects in the overall archive navigation experience.
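
A minimal sketch of the behaviour described above, not the actual implementation. It assumes a hypothetical `MaskingProxySketch` class, a dict standing in for the masking database, a `FakeStorage` backend, and shortened placeholder identifiers that are not valid SWHIDs:

```python
from typing import Dict, Iterable, List, Optional


class MaskedObjectException(Exception):
    """Stand-in for the non-retryable exception described above."""

    def __init__(self, masked: Dict[str, str]) -> None:
        super().__init__(f"{len(masked)} masked object(s)")
        self.masked = masked


class FakeStorage:
    """Hypothetical backend: returns None for objects it does not hold."""

    def __init__(self, objects: Dict[str, dict]) -> None:
        self.objects = objects

    def content_get(self, swhids: Iterable[str]) -> List[Optional[dict]]:
        return [self.objects.get(swhid) for swhid in swhids]


class MaskingProxySketch:
    def __init__(self, storage: FakeStorage, masking_db: Dict[str, str]) -> None:
        self.storage = storage
        self.masking_db = masking_db  # assumption: identifier -> masking state

    def content_get(self, swhids: Iterable[str]) -> List[Optional[dict]]:
        # Query the underlying storage first, so that the masking database is
        # only consulted for objects that actually exist.
        results = self.storage.content_get(swhids)
        masked = {
            obj["swhid"]: self.masking_db[obj["swhid"]]
            for obj in results
            if obj is not None and obj["swhid"] in self.masking_db
        }
        if masked:
            raise MaskedObjectException(masked)
        return results


storage = FakeStorage({
    "swh:1:cnt:aa": {"swhid": "swh:1:cnt:aa"},
    "swh:1:cnt:bb": {"swhid": "swh:1:cnt:bb"},
})
proxy = MaskingProxySketch(
    storage, {"swh:1:cnt:bb": "permanent", "swh:1:cnt:cc": "temporary"}
)
```

A client that wants the unmasked part of a batch could catch the exception, drop the keys of `exc.masked` from its request, and call the method again.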
-
Nicolas Dandrimont authored
Ensure that the masking stuff is fully isolated except for the dbflavor.
-
Nicolas Dandrimont authored
-
Nicolas Dandrimont authored
This is a simple database of the SWHIDs of objects for which we have made a policy decision to restrict the diffusion without removing them from the archive, and a lightweight history structure for the associated object masking requests.

Doing this as an overlay, instead of modifying the storage schema for all objects, allows us to start better separating the concerns of archival of origins (which necessitates a full view of all the unmodified objects that are stored in the archive) from the concerns about the dissemination of said archived objects.

To avoid interfering with archival, the masking policy will only be applied for full object retrieval and implemented as a new proxy storage, which will be placed in front of all public-facing storages.
-
Nicolas Dandrimont authored
This new type will be used for non-retryable exceptions that will not be storage argument exceptions.
-
Nicolas Dandrimont authored
This helper computes the timing difference between an inner function and an outer block, and sends the result to statsd with a set of tags. This will allow us to measure the overhead of some endpoints.
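
A minimal sketch of such a helper, under stated assumptions: the name `measure_overhead`, the metric name, the tag, and the `SENT` list standing in for a statsd client are all hypothetical, and the injectable `clock` parameter exists only to make the arithmetic easy to check.

```python
import time
from contextlib import contextmanager
from typing import Any, Callable, Dict, Iterator, List, Tuple

# Hypothetical sink standing in for a statsd client.
SENT: List[Tuple[str, float, Dict[str, str]]] = []


@contextmanager
def measure_overhead(
    inner: Callable[..., Any],
    tags: Dict[str, str],
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[Callable[..., Any]]:
    """Yield a timed version of `inner`; on exit, report outer minus inner time."""
    inner_total = 0.0

    def timed(*args: Any, **kwargs: Any) -> Any:
        nonlocal inner_total
        start = clock()
        try:
            return inner(*args, **kwargs)
        finally:
            inner_total += clock() - start

    outer_start = clock()
    try:
        yield timed
    finally:
        outer_total = clock() - outer_start
        # The difference is the time spent in the block outside of `inner`.
        SENT.append(("overhead_seconds", outer_total - inner_total, tags))


# Usage with a fake clock: the outer block spans 0.0 to 6.0, the inner call 1.0 to 4.0.
ticks = iter([0.0, 1.0, 4.0, 6.0])
with measure_overhead(lambda x: x * 2, {"method": "content_get"}, lambda: next(ticks)) as call:
    result = call(21)
```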
-
Nicolas Dandrimont authored
The endpoint tag gets overwritten by the Kubernetes service monitors, so let's avoid using it.
-
- Mar 27, 2024
-
-
Nicolas Dandrimont authored
The intent behind test_types is to test the signature of wrapped storages, to check that they match that of the StorageInterface Protocol. However, the way the test was refactored ended up testing the storage being *wrapped* by the storage under test, masking a few inconsistencies in the way storages are being wrapped. Unfortunately this breaks the tenacious proxy's test in an inscrutable way (even when `functools.wraps`ing the return values of its `__getattr__` function).
-
Nicolas Dandrimont authored
-
Antoine Lambert authored
Since the release of pytest 8.1, some pytest options are no longer needed and an editable install can be used when running tests using tox.
-
- Mar 22, 2024
-
-
Antoine R. Dumont authored
This reverts commit 74caf618. Refs. swh/infra/sysadm-environment#5291
-
- Mar 21, 2024
-
-
Antoine R. Dumont authored
But still retrieve empty entries as None in the model when that makes sense. This should not disturb the current api calls when reading from cassandra. Refs. swh/infra/sysadm-environment#5287
-
Antoine R. Dumont authored
-
- Mar 11, 2024
-
-
Nicolas Dandrimont authored
All the data has been migrated, so this fallback can now be removed. Ref. swh/infra/sysadm-environment#2564
-
- Mar 05, 2024
-
-
- Feb 13, 2024
-
-
Antoine Lambert authored
This makes it possible to filter on a specific visit type when searching for a visit by date. Related to swh-web#4786.
-