- Sep 23, 2021
-
-
vlorentz authored
Refactor identifiers & model to make *_git_object() functions work on model classes instead of dicts
Since we now use these classes everywhere, computing hashes required using to_dict() just to compute identifiers, which can be a performance bottleneck in code computing many checksums.
-
vlorentz authored
A future commit will make identifier computation use the attrs classes, which are strict about what they accept.
-
vlorentz authored
identifiers.py initially worked only on bare sha1_git. I chose to add the SWHID classes in that module because of the name, but the SWHID code didn't actually interact with the other functions in the module, so it now feels out of place to me.
-
Raphaël Gomès authored
We're about to have a Bazaar loader
-
- Sep 16, 2021
-
-
vlorentz authored
- Jul 27, 2021
-
-
Stefan Sperling authored
-
- Jul 23, 2021
-
-
Nicolas Dandrimont authored
This allows distinguishing multiple potential versions of the mapping between external objects and their counterparts archived in Software Heritage, for instance when a loader has a backwards-incompatible change that should result in objects being loaded again. The field defaults to zero, in which case it's backwards-compatible with the previous implementation in terms of identifier computation.
-
- Jul 02, 2021
-
-
Daniele Serafini authored
-
- Jun 25, 2021
-
-
vlorentz authored
We agreed a while ago they should be IRIs and not just URIs. This will trigger crashes in swh.storage.cassandra, as it currently expects (wrongly) that origin URLs are ASCII.
-
vlorentz authored
* empty fetcher name or version is not accepted by cassandra (and is nonsensical anyway)
* ditto for non-ASCII (and any non-printable is nonsensical)
* null bytes/chars are accepted by neither postgresql nor cassandra
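The three constraints listed above can be sketched as a single check. This is only an illustration: the function name `check_fetcher_field` is hypothetical, and treating "printable ASCII" as the range 0x20 to 0x7E is an assumption, not something the commit states.

```python
def check_fetcher_field(field_name: str, value: str) -> None:
    """Hypothetical validator mirroring the constraints in the commit message.

    Raises ValueError if the value is empty or contains anything outside
    printable ASCII (assumed here to be 0x20..0x7E, which also excludes
    null bytes and other control characters).
    """
    if not value:
        raise ValueError(f"{field_name} must not be empty")
    for char in value:
        if not 0x20 <= ord(char) <= 0x7E:
            raise ValueError(
                f"{field_name} contains a non-printable or non-ASCII "
                f"character: {char!r}"
            )


def is_valid(value: str) -> bool:
    """Small helper for demonstration: True if the check passes."""
    try:
        check_fetcher_field("name", value)
    except ValueError:
        return False
    return True
```

A single range check covers the second and third bullets at once, since a null byte is just one more character outside the assumed printable range.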
-
- Jun 21, 2021
-
-
Daniele Serafini authored
Closes T3393
-
- Jun 15, 2021
-
-
Daniele Serafini authored
- add a typing annotation to avoid such errors in the future
Fixes T3383
-
David Douard authored
-
David Douard authored
The problem was that for datetime < epoch, the timestamp is negative, but since it's a float that includes the microseconds, if both conditions hold (datetime < epoch and microsecond > 0), the computed (int) timestamp was off by one. Add dedicated tests for this.
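The off-by-one can be reproduced in a few lines. The sketch below is not the project's code: `split_timestamp` is a hypothetical helper, and the choice to represent a timestamp as whole seconds plus a non-negative microsecond count is an assumption made for the illustration.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def split_timestamp(dt: datetime) -> tuple:
    """Return (seconds, microseconds) with 0 <= microseconds < 1_000_000.

    timedelta normalizes its fields so that seconds and microseconds are
    never negative; only days can be. This gives floor semantics for free.
    """
    delta = dt - EPOCH
    seconds = delta.days * 86400 + delta.seconds
    return seconds, delta.microseconds


# Half a second before the epoch.
dt = datetime(1969, 12, 31, 23, 59, 59, 500000, tzinfo=timezone.utc)

# int() truncates toward zero, so the float timestamp -0.5 becomes 0,
# one second later than the floor value paired with 500000 microseconds.
truncated_seconds = int(dt.timestamp())
floored = split_timestamp(dt)
```

Here `truncated_seconds` is 0 while `floored` is `(-1, 500000)`. For datetimes at or after the epoch, truncation and floor agree, which is why the bug only shows up when both conditions hold.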
-
- Jun 11, 2021
-
-
Daniele Serafini authored
-
- Jun 09, 2021
-
-
Antoine Lambert authored
-
- May 19, 2021
-
-
David Douard authored
make sure the snapshot id in OriginVisitStatus refers to existing Snapshot objects.
-
- May 11, 2021
-
-
vlorentz authored
The git_object is what will be actually useful to the vault. It's also easier to test, because test_identifier.py has the entire git_object in its test data.
-
vlorentz authored
Before this commit, manifests were only computed internally before hashing, so they were not available to outside modules. This makes testing the module very painful, because identifier functions can only be tested by checking the hash; so test failures did not show mismatches between the computed manifest and the expected one. Additionally, the 'git bare cooker' of the vault is likely to use these as well, as it needs to format git objects in the same format.
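To illustrate why exposing the pre-hash bytes helps testing, here is a minimal sketch of the standard git object layout (a `<type> <length>\0` header followed by the content, hashed with SHA-1). The function names `git_object` and `identifier` are hypothetical and are not claimed to match this module's API.

```python
import hashlib


def git_object(object_type: str, content: bytes) -> bytes:
    """Build the raw bytes git hashes: b"<type> <length>\\0" + content."""
    header = b"%s %d\x00" % (object_type.encode("ascii"), len(content))
    return header + content


def identifier(raw_object: bytes) -> str:
    """Hex SHA-1 of the full git object, i.e. the object's id."""
    return hashlib.sha1(raw_object).hexdigest()


raw = git_object("blob", b"hello\n")
empty_blob_id = identifier(git_object("blob", b""))
```

With the two steps separated, a test can compare `raw` against the expected bytes directly, so a failure shows where the serialization diverges instead of only reporting two different hashes.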
- May 06, 2021
-
-
vlorentz authored
There is a regression that breaks attr.evolve() when updating attributes that contain an attr class, which we use (e.g. for Person or TimestampWithTimezone). v21.2.0 is expected to fix the issue, but won't be released immediately: https://github.com/python-attrs/attrs/issues/804#issuecomment-833471190
-
- Apr 30, 2021
-
-
vlorentz authored
-
- Apr 28, 2021
-
-
Antoine Lambert authored
Allow checking that the package documentation can be built without producing sphinx warnings. The sphinx environment is designed to be used in continuous integration in order to prevent breaking the documentation build when committing changes. The sphinx-dev environment is designed to be used inside a full swh development environment. Related to T3258
-
- Apr 23, 2021
-
-
David Douard authored
and add a test to keep them correct.
-
- Apr 15, 2021
-
-
Antoine Pietri authored
-
vlorentz authored
-
- Apr 13, 2021
-
-
Antoine Lambert authored
According to the SWHID specification, it is not forbidden for a qualifier value to contain a '=' character (for instance in origin URL). So update parsing code to handle that special case.
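A minimal sketch of the special case, assuming qualifiers are separated by `;` and each one is a `key=value` pair. `parse_qualifiers` is a hypothetical name, not the module's actual function.

```python
def parse_qualifiers(qualifiers_str: str) -> dict:
    """Parse "k1=v1;k2=v2" into a dict, splitting each pair on the FIRST '='.

    Splitting only once keeps any further '=' characters (for instance in
    the query string of an origin URL) inside the value.
    """
    qualifiers = {}
    for pair in qualifiers_str.split(";"):
        if not pair:
            continue  # tolerate empty segments such as a trailing ';'
        key, separator, value = pair.partition("=")
        if not separator:
            raise ValueError(f"qualifier {pair!r} has no '=' separator")
        qualifiers[key] = value
    return qualifiers


parsed = parse_qualifiers("origin=https://example.org/repo?a=b;lines=5-10")
```

A naive `pair.split("=")` would produce three parts for the `origin` qualifier above and fail to unpack into a key and a value; `str.partition` avoids that by stopping at the first separator.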
-
Antoine Lambert authored
Some ValidationError exceptions could not be serialized to string due to these format errors. Related to T3234
-
- Apr 12, 2021
- Apr 09, 2021
-
-
vlorentz authored
And show nice human-readable errors instead
-
- Apr 08, 2021
-
- Mar 26, 2021
-
-
Antoine Pietri authored
Some releases don't have author and date fields; this case should be checked in the tests.
-
- Mar 18, 2021
-
-
Nicolas Dandrimont authored
This truncation is already enshrined at the identifier level. Truncate the object itself as well, to reduce the possibility of multiple different metadata objects having the same identifier.
-
- Mar 12, 2021
-
- Mar 10, 2021
-
-
David Douard authored
This object aims to keep an SWHID <-> external object ID map in the SWH Archive, e.g. to keep track of Mercurial ids so the Mercurial loader can be made more efficient. Related to T2849.
-
- Mar 08, 2021
-
-
David Douard authored
was modifying the dict given as argument.
-