- Jan 07, 2022
-
- Dec 22, 2021
-
-
Nicolas Dandrimont authored
blake2s and blake2b have been provided by the stdlib hashlib since Python 3.6, and we declare 3.7 as the minimum supported Python version.
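As a quick illustration of the stdlib support this message relies on, here is a minimal sketch; the input bytes are arbitrary and the variable names are only illustrative.

```python
import hashlib

data = b"example content"

# Both constructors ship with hashlib since Python 3.6,
# so no third-party BLAKE2 package is needed.
blake2b_hex = hashlib.blake2b(data).hexdigest()                  # 64-byte digest by default
blake2s_hex = hashlib.blake2s(data, digest_size=32).hexdigest()  # 32 bytes is the blake2s maximum

print(len(blake2b_hex), len(blake2s_hex))
```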
-
vlorentz authored
1. Most objects do not need it, so it's a waste of space.
2. This means we just extend the existing format (some objects will have that key in their dict) instead of changing it (retroactively adding it to all objects).
-
vlorentz authored
This will be used to store the original manifest of 'weird' git objects, when we cannot reasonably represent them otherwise.
-
- Dec 21, 2021
- Dec 17, 2021
-
-
vlorentz authored
Revision and release do not generally allow 'arbitrary' metadata; and it was missing ExtIDs and REMD
-
- Dec 16, 2021
-
-
Antoine R. Dumont authored
This also drops spurious copyright headers from those files, if present. Related to T3812.
-
- Dec 15, 2021
-
-
vlorentz authored
Using .now() produces data that differs between xdist processes, as files are imported after forking, and xdist requires consistent data across processes.
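The problem described above can be sketched as follows. The constant and helper names are hypothetical and only illustrate the idea of replacing an import-time `.now()` with a fixed value.

```python
import datetime

# Evaluated at import time in every worker process, so each process
# would see a different value:
# UNSTABLE_DATE = datetime.datetime.now(tz=datetime.timezone.utc)

# A hard-coded constant is identical in every process.
FIXED_DATE = datetime.datetime(2021, 12, 15, 12, 0, 0, tzinfo=datetime.timezone.utc)


def make_test_record(name: str) -> dict:
    """Build a test fixture whose date does not depend on the import time."""
    return {"name": name, "date": FIXED_DATE}
```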
-
- Dec 08, 2021
-
-
vlorentz authored
It calls attr.validate() (which calls the validators), and recomputes the hash of HashableObject instances. A future commit will also make it check the raw_manifest attribute when relevant.
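A rough, stdlib-only sketch of the "validate, then recompute the hash" idea. The class below is hypothetical: it uses a dataclass and SHA-1 over the raw data as stand-ins, not the attrs classes or manifest format of the actual model.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class HashedBlob:
    """Hypothetical stand-in for a hashable model object."""

    data: bytes
    id: bytes

    def compute_hash(self) -> bytes:
        return hashlib.sha1(self.data).digest()

    def check(self) -> None:
        # Step 1: field validation (stand-in for running the validators).
        if type(self.data) is not bytes or type(self.id) is not bytes:
            raise TypeError("data and id must be bytes")
        # Step 2: recompute the hash and compare it with the stored id.
        if self.compute_hash() != self.id:
            raise ValueError("stored id does not match the recomputed hash")
```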
-
vlorentz authored
It's just simpler this way
-
vlorentz authored
For the sake of completeness (a future commit may depend on it).
-
vlorentz authored
For now it is filled from 'offset' and 'negative_utc', but it will replace them in a future commit. This simplifies the code and adds support for 'weird' offsets we do not currently handle.
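One way to picture deriving a single value from 'offset' and 'negative_utc' is sketched below. It assumes the new field holds a git-style `+HHMM` byte string and that 'offset' is in minutes; the message states neither, and the function name is hypothetical.

```python
def offset_to_bytes(offset_minutes: int, negative_utc: bool = False) -> bytes:
    """Render an offset in minutes as a git-style b"+HHMM" / b"-HHMM" string.

    negative_utc only matters for a zero offset, where it selects b"-0000".
    """
    negative = offset_minutes < 0 or (offset_minutes == 0 and negative_utc)
    hours, minutes = divmod(abs(offset_minutes), 60)
    sign = b"-" if negative else b"+"
    return sign + b"%02d%02d" % (hours, minutes)
```

A raw byte representation like this can also carry values that an (integer, flag) pair cannot express, which is presumably what 'weird' offsets refers to.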
-
Antoine Lambert authored
It makes it easy to check whether a path exists under a root directory.
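A minimal sketch of such a path lookup, using nested dicts as a hypothetical stand-in for directory objects (sub-dicts are directories, `None` marks a file). The function name is illustrative and not the method added by this commit.

```python
def path_exists(root: dict, path: bytes) -> bool:
    """Walk the tree one path component at a time, starting from the root."""
    node = root
    for part in path.strip(b"/").split(b"/"):
        # A file (None) has no children, and a missing name ends the walk.
        if not isinstance(node, dict) or part not in node:
            return False
        node = node[part]
    return True


root = {b"src": {b"main.py": None}, b"README": None}
```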
-
- Dec 07, 2021
-
-
Antoine Lambert authored
Since rDMOD8d96dfedee34203a4118e48a6208ee507511590b, directory entry names are validated in the DirectoryEntry model and thus must not contain any slash characters. Update the directory_entries_d hypothesis strategy so that it only generates slash-free names, to avoid flaky tests.
-
- Dec 06, 2021
-
-
Antoine Lambert authored
Allow computing the md5 sum through the hashutil.MultiHash class. However, md5 is not added to the DEFAULT_ALGORITHMS set and must be explicitly requested by client code. Related to T2400.
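The opt-in behavior can be sketched with plain hashlib. This is not the MultiHash API: the function name and the contents of the default set below are assumptions made for illustration.

```python
import hashlib
from typing import Dict, Iterable

# Illustrative default set; md5 is deliberately absent.
DEFAULT_ALGORITHMS = frozenset({"sha1", "sha256"})


def multi_hash(data: bytes, algorithms: Iterable[str] = DEFAULT_ALGORITHMS) -> Dict[str, bytes]:
    """Compute one digest per requested algorithm."""
    return {algo: hashlib.new(algo, data).digest() for algo in algorithms}


# md5 is only computed when the caller asks for it explicitly.
with_md5 = multi_hash(b"", DEFAULT_ALGORITHMS | {"md5"})
```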
-
Antoine Lambert authored
-
- Dec 01, 2021
-
-
vlorentz authored
I don't know any instance of these, but there is no harm in checking them.
-
- Nov 05, 2021
-
-
vlorentz authored
1. hashes are now repr()ed as `hash_to_bytes("1234...")` instead of `b"\x12\x34..."`
2. SWHID objects are now repr()ed as `CoreSWHID.from_string('swh:1:...:1234...')` instead of `CoreSWHID(scheme='swh', version='1', object_type=..., object_id=b'\x12\x34')`
3. enums are now repr()ed as `MyEnum.NAME` instead of `<MyEnum.NAME: 'value'>`

Thanks to these three changes, using repr() on a model object now prints a string that can be pasted directly in a `.py` file to write a new test case.
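A small sketch of how pasteable reprs like these can be produced. `MyEnum` mirrors the placeholder used in the message; `hash_repr` is a hypothetical helper, not the code from the commit.

```python
from enum import Enum


class MyEnum(Enum):
    NAME = "value"

    def __repr__(self) -> str:
        # "MyEnum.NAME" is a valid Python expression, unlike "<MyEnum.NAME: 'value'>".
        return f"{type(self).__name__}.{self.name}"


def hash_repr(h: bytes) -> str:
    """Render a hash as a hash_to_bytes("...") call instead of escaped bytes."""
    return f'hash_to_bytes("{h.hex()}")'


print(repr(MyEnum.NAME))
print(hash_repr(b"\x12\x34"))
```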
-
- Oct 01, 2021
-
-
vlorentz authored
The previous commit replaced attrs-strict's type validator with our own stricter and faster validator. However, the strictness can be a burden in other packages; for example, swh-storage tests rely on the looser check to insert dummy data that raises an exception when accessed, which would be hard to do while using the exact expected type. This commit reverts the strict behavior but keeps the performance optimization: it always checks with type equality first and, if that fails (which would have raised an error before this commit), gives the value a 'second chance' by trying isinstance. This means that, outside tests, isinstance should be used very rarely, if at all.
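The two-step check described above can be sketched as follows. The function names are hypothetical, and the real validator also has to deal with generic type annotations, which this sketch ignores.

```python
def type_matches(value, expected_type: type) -> bool:
    # Fast path: exact type equality, true for almost all real values.
    if type(value) is expected_type:
        return True
    # 'Second chance': accept subclasses (e.g. test doubles) via isinstance.
    return isinstance(value, expected_type)


def validate_type(name: str, value, expected_type: type) -> None:
    """Raise TypeError when the value passes neither check."""
    if not type_matches(value, expected_type):
        raise TypeError(
            f"{name} must be {expected_type.__name__}, got {type(value).__name__}"
        )
```

Because the equality test succeeds for nearly every value in production code, the slower isinstance call is only reached for the rare subclass instance or for a value that is about to be rejected.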
-
- Sep 28, 2021
-
-
vlorentz authored
This reimplements attrs_strict.type_validator(), using type equality instead of isinstance. This makes my checksum validation script (which mostly just instantiates model objects, computes a checksum, then discards them) run twice as fast.
-
- Sep 24, 2021
-
- Sep 23, 2021
-
-
vlorentz authored
-
vlorentz authored
For consistency, as the classes are now in swhids.py
-
vlorentz authored
+ raise warnings
-
vlorentz authored
1. Add a warning
2. Move identifier/manifest documentation to git_objects.py
3. Remove all imports of that module.

Motivation:
* SWHID classes were moved to swhids.py
* manifest computation functions were moved to git_objects.py
* Only reexports and trivial wrappers of model.py remain
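A generic sketch of a backward-compat wrapper that warns on use. The function names are hypothetical and the hashing is a placeholder; the message does not say what form the actual warning takes.

```python
import hashlib
import warnings


def compute_id(data: bytes) -> str:
    """Hypothetical new home of the functionality."""
    return hashlib.sha1(data).hexdigest()


def legacy_compute_id(data: bytes) -> str:
    """Hypothetical wrapper kept in the old module for compatibility."""
    warnings.warn(
        "legacy_compute_id is deprecated, use compute_id instead",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not at this wrapper
    )
    return compute_id(data)
```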
-
vlorentz authored
-
vlorentz authored
They are not used anywhere.
-
vlorentz authored
Since they are used by the vault for non-identifier-related stuff, I think it makes sense to move them to a new module. identifiers.py is now an empty shell, as all its features were moved to other modules and it only contains reexports and backward-compat functions. Therefore, it should be considered deprecated from now on.
-
vlorentz authored
Refactor identifiers & model to make *_git_object() functions work on model classes instead of dicts.

Since we now use these classes everywhere, computing hashes required using to_dict() just to compute identifiers, which can be a performance bottleneck in code computing many checksums.
-
vlorentz authored
A future commit will make identifier computation use the attrs classes, which are strict about what they accept.
-
vlorentz authored
identifiers.py initially worked only on bare sha1_git. I chose to add the SWHID classes in that module because of the name, but the SWHID code didn't actually interact with the other functions in the module, so it now feels out of place to me.
-
Raphaël Gomès authored
We're about to have a Bazaar loader
-
- Sep 16, 2021
-
-
vlorentz authored
- Jul 27, 2021
-
-
Stefan Sperling authored
-
- Jul 23, 2021
-
-
Nicolas Dandrimont authored
This allows distinguishing multiple potential versions of the mapping between external objects and their counterparts archived in Software Heritage, for instance when a loader has a backwards-incompatible change that should result in objects being loaded again. The field defaults to zero, in which case it's backwards-compatible with the previous implementation in terms of identifier computation.
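The backward-compatibility property can be sketched like this: the version only enters the hashed manifest when it is non-zero, so objects with the default version keep their previous identifier. The class, field names, and manifest layout below are hypothetical, not the actual format.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ExternalMapping:
    """Hypothetical mapping between an external ID and an archived object."""

    extid_type: str
    extid: bytes
    target: str
    version: int = 0

    def manifest(self) -> bytes:
        lines = [b"extid_type " + self.extid_type.encode()]
        if self.version != 0:
            # Only emitted for non-default versions, so version 0 hashes
            # exactly like a manifest built before the field existed.
            lines.append(b"version " + str(self.version).encode())
        lines.append(b"extid " + self.extid.hex().encode())
        lines.append(b"target " + self.target.encode())
        return b"\n".join(lines) + b"\n"

    def compute_id(self) -> str:
        return hashlib.sha1(self.manifest()).hexdigest()
```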
-