- Sep 23, 2021
-
-
Raphaël Gomès authored
We're about to have a Bazaar loader
-
- Sep 16, 2021
-
-
vlorentz authored
- Jul 27, 2021
-
-
Stefan Sperling authored
-
- Jul 23, 2021
-
-
Nicolas Dandrimont authored
This allows distinguishing multiple potential versions of the mapping between external objects and their counterparts archived in Software Heritage, for instance when a loader has a backwards-incompatible change that should result in objects being loaded again. The field defaults to zero, in which case it's backwards-compatible with the previous implementation in terms of identifier computation.
-
- Jul 02, 2021
-
-
Daniele Serafini authored
-
- Jun 25, 2021
-
-
vlorentz authored
We agreed a while ago they should be IRIs and not just URIs. This will trigger crashes in swh.storage.cassandra, as it currently expects (wrongly) that origin URLs are ASCII.
-
vlorentz authored
* an empty fetcher name or version is not accepted by Cassandra (and is nonsensical anyway)
* ditto for non-ASCII (and any non-printable is nonsensical)
* null bytes/chars are accepted by neither PostgreSQL nor Cassandra
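The three constraints above can be written as a single check. This is a minimal sketch only: `check_fetcher_field` is a hypothetical helper named for illustration, not a function of swh.model, and it assumes the values are Python `str` objects.

```python
def check_fetcher_field(field_name: str, value: str) -> str:
    """Hypothetical validator: reject values that the storage backends refuse."""
    if not value:
        raise ValueError(f"{field_name} must not be empty")
    if "\x00" in value:
        # Null characters are rejected by both backends.
        raise ValueError(f"{field_name} must not contain null characters")
    if not value.isascii() or not value.isprintable():
        raise ValueError(f"{field_name} must be printable ASCII")
    return value


def is_accepted(value: str) -> bool:
    """Convenience wrapper returning a boolean instead of raising."""
    try:
        check_fetcher_field("name", value)
    except ValueError:
        return False
    return True
```

With this sketch, `is_accepted("swh-loader-git")` is true, while the empty string, `"caf\u00e9"` and `"a\x00b"` are all rejected.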
-
- Jun 21, 2021
-
-
Daniele Serafini authored
Closes T3393
-
- Jun 15, 2021
-
-
Daniele Serafini authored
- add a typing annotation to avoid such errors in the future. Fixes T3383
-
David Douard authored
-
David Douard authored
The problem occurred for datetimes before the epoch: the timestamp is negative, but since it is a float that includes the microseconds, when both conditions hold (datetime < epoch and microsecond > 0), the computed integer timestamp was off by one. Add dedicated tests for this.
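The off-by-one can be reproduced with the standard library alone. The sketch below is an illustration of the failure mode, not the project's actual fix; `split_timestamp` is a hypothetical name. It relies on `timedelta` normalization, which keeps the microseconds component non-negative even when the total is negative.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def split_timestamp(dt: datetime) -> tuple:
    """Return (seconds, microseconds) with 0 <= microseconds < 1_000_000."""
    if dt.tzinfo is None:
        raise ValueError("naive datetimes are ambiguous")
    delta = dt - EPOCH
    # timedelta stores days (possibly negative), plus non-negative
    # seconds and microseconds, so this floors instead of truncating.
    seconds = delta.days * 86400 + delta.seconds
    return seconds, delta.microseconds


# Half a second before the epoch.
before_epoch = datetime(1969, 12, 31, 23, 59, 59, 500000, tzinfo=timezone.utc)

# int() truncates toward zero: -0.5 becomes 0, which combined with
# microsecond=500000 would mean +0.5s, one second too late.
truncated_seconds = int(before_epoch.timestamp())

floored_seconds, microseconds = split_timestamp(before_epoch)
```

Here `truncated_seconds` is 0 while `floored_seconds` is -1, and only the latter is consistent with the 500000 microseconds stored alongside it.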
-
- Jun 11, 2021
-
-
Daniele Serafini authored
-
- Jun 09, 2021
-
-
Antoine Lambert authored
-
- May 19, 2021
-
-
David Douard authored
make sure the snapshot id in OriginVisitStatus refers to existing Snapshot objects.
-
- May 11, 2021
-
-
vlorentz authored
The git_object is what will actually be useful to the vault. It's also easier to test, because test_identifier.py has the entire git_object in its test data.
-
vlorentz authored
Before this commit, manifests were only computed internally before hashing, so they were not available to outside modules. This made testing the module very painful, because identifier functions could only be tested by checking the hash, so test failures did not show mismatches between the computed manifest and the expected one. Additionally, the 'git bare cooker' of the vault is likely to use these as well, as it needs to format git objects in the same format.
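To illustrate why exposing the intermediate bytes helps testing, here is a minimal sketch of the standard git object encoding (a `<type> <length>` header, a null byte, then the manifest), hashed with SHA-1. The function names are hypothetical and are not claimed to match the module's API.

```python
import hashlib


def git_object_from_manifest(object_type: str, manifest: bytes) -> bytes:
    """Prefix a manifest with the git header '<type> <length>' and a null byte."""
    header = f"{object_type} {len(manifest)}".encode("ascii") + b"\x00"
    return header + manifest


def identifier_from_git_object(git_object: bytes) -> str:
    """Hash the full git object; this is the sha1_git identifier, in hex."""
    return hashlib.sha1(git_object).hexdigest()


# A test can now compare the bytes directly and only then the hash,
# so a failure shows where the manifest differs.
empty_blob = git_object_from_manifest("blob", b"")
empty_blob_id = identifier_from_git_object(empty_blob)
```

For the empty blob, `empty_blob` is `b"blob 0\x00"` and `empty_blob_id` is the well-known git hash `e69de29bb2d1d6434b8b29ae775ad8c2e48c5391`.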
-
- Apr 23, 2021
-
-
David Douard authored
and add a test to keep them correct.
-
- Apr 15, 2021
-
-
Antoine Pietri authored
-
vlorentz authored
-
- Apr 13, 2021
-
-
Antoine Lambert authored
According to the SWHID specification, it is not forbidden for a qualifier value to contain a '=' character (for instance in an origin URL). Update the parsing code to handle that special case.
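A minimal sketch of the idea, assuming qualifiers arrive as a single `;`-separated string: splitting each item on the first '=' only keeps any later '=' inside the value. `parse_qualifiers` is a hypothetical name and is not the module's actual parser.

```python
def parse_qualifiers(qualifiers: str) -> dict:
    """Parse 'key=value;key=value', allowing '=' inside values."""
    result = {}
    for item in qualifiers.split(";"):
        if not item:
            continue
        # partition() splits on the first '=' only, unlike split("="),
        # which would break a value such as a URL with a query string.
        key, separator, value = item.partition("=")
        if not separator:
            raise ValueError(f"qualifier {item!r} is missing '='")
        result[key] = value
    return result


parsed = parse_qualifiers("origin=https://example.org/repo?a=b;path=/src")
```

The origin value keeps its query string intact, so `parsed["origin"]` is `"https://example.org/repo?a=b"`.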
-
Antoine Lambert authored
Some ValidationError exceptions could not be serialized to string due to these format errors. Related to T3234
-
- Apr 12, 2021
-
-
vlorentz authored
-
- Apr 09, 2021
-
-
vlorentz authored
And show nice human-readable errors instead
-
- Apr 08, 2021
-
- Mar 26, 2021
-
-
Antoine Pietri authored
Some releases don't have author and date fields; this case should be checked in the tests.
-
- Mar 18, 2021
-
-
Nicolas Dandrimont authored
This truncation is already enshrined at the identifier level. Truncate the object itself as well, to reduce the possibility of multiple different metadata objects having the same identifier.
-
- Mar 12, 2021
-
- Mar 10, 2021
-
-
David Douard authored
This object is meant to keep, in the SWH Archive, a map between SWHIDs and external object IDs, e.g. to keep track of Mercurial ids so the Mercurial loader can be made more efficient. Related to T2849.
-
- Mar 08, 2021
-
-
David Douard authored
was modifying the dict given as argument.
-
- Mar 04, 2021
-
-
vlorentz authored
The rounding algorithm wasn't specified
-
vlorentz authored
Serializing as ISO8601 makes the hash brittle, because the database may change the timezone silently and/or lose precision in the microseconds. As we do not need precise timestamps, using an integer is good enough, and is consistent with the git format. The manifest also does not need to contain a timezone, as it only represents the timezone of the system that fetched this metadata, which is useless data.
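The brittleness can be seen by taking one instant, shifting its timezone and dropping its microseconds, as a database round-trip might. This is a sketch under an assumption: it floors to whole seconds, which may or may not be the rounding rule the project settled on. `manifest_timestamp` is a hypothetical name.

```python
import math
from datetime import datetime, timedelta, timezone


def manifest_timestamp(dt: datetime) -> int:
    """Whole seconds since the epoch (assumption: floored, timezone dropped)."""
    if dt.tzinfo is None:
        raise ValueError("naive datetimes are ambiguous")
    return math.floor(dt.timestamp())


original = datetime(2021, 3, 4, 12, 0, 0, 123456, tzinfo=timezone.utc)

# Same instant after a hypothetical round-trip: different timezone
# offset, microseconds lost.
round_tripped = original.astimezone(timezone(timedelta(hours=1))).replace(
    microsecond=0
)

iso_differs = original.isoformat() != round_tripped.isoformat()
integer_matches = manifest_timestamp(original) == manifest_timestamp(round_tripped)
```

The two ISO8601 strings differ, so a hash over them would change, while the integer form is identical for both values.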
-
vlorentz authored
So that they can be properly deduplicated and referenced.
-
vlorentz authored
This will be used to compute an intrinsic identifier for RawExtrinsicMetadata, which can be used for deduplication and for referring to it like any other sha1_git instead of needing to use a tuple of its fields.
- Mar 03, 2021
-
-
vlorentz authored
- Mar 01, 2021
-