- Mar 01, 2021
-
- Feb 23, 2021
-
-
vlorentz authored
-
vlorentz authored
* Quote/unquote path
* Fix line parsing and serializing to properly handle None
* Fix error raised by check_visit/check_anchor
-
vlorentz authored
by making them all derive from an abstract class.
-
vlorentz authored
Following the discussion on T3034, we decided to replace SWHID with two or three classes:

* QualifiedSWHID to replace the existing SWHID (standard types + qualifiers)
* CoreSWHID, for "core SWHID" only (standard types + no qualifiers)
* ExtendedSWHID for internal use in Software Heritage (extra types + no qualifiers)

This commit adds the last one. It also removes "ori" as a valid object type for CoreSWHID and QualifiedSWHID, as it now only belongs in ExtendedSWHID.
-
- Feb 19, 2021
-
-
vlorentz authored
It is cleaner, avoids warnings, and will be needed when introducing ExtendedSWHID in a future commit.
-
vlorentz authored
And store their parsed values (CoreSWHID, tuple of ints, etc.) instead of string.
-
vlorentz authored
Following the discussion on T3034, we decided to replace SWHID with two or three classes:

* QualifiedSWHID to replace the existing SWHID (standard types + qualifiers)
* CoreSWHID, for "core SWHID" only (standard types + no qualifiers)
* ExtendedSWHID for internal use in Software Heritage (extra types + no qualifiers)

This commit adds the second one.
-
vlorentz authored
Following the discussion on T3034, we decided to replace SWHID with two or three classes:

* QualifiedSWHID to replace the existing SWHID (standard types + qualifiers)
* CoreSWHID, for "core SWHID" only (standard types + no qualifiers)
* ExtendedSWHID for internal use in Software Heritage (extra types + no qualifiers)

Since migrating from SWHID will break existing code, this commit uses the opportunity to modernize it a little, i.e.:

* `keyword`-only constructor, to get rid of the hacky default values for `object_type` and `object_id`
* enum instead of strings for the object type
* `bytes` instead of a hex string for the object id
* rename `metadata` to `qualifiers`
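
The four modernizations listed above can be illustrated with a minimal sketch. This is not the swh.model implementation: `SketchObjectType` and `SketchSWHID` are hypothetical names, and the 20-byte length check assumes the object id is a raw SHA-1 digest.

```python
import enum
from typing import Dict, Optional


class SketchObjectType(enum.Enum):
    """Hypothetical enum replacing free-form strings for the object type."""

    SNAPSHOT = "snp"
    RELEASE = "rel"
    REVISION = "rev"
    DIRECTORY = "dir"
    CONTENT = "cnt"


class SketchSWHID:
    """Illustrative stand-in, not the actual swh.model class."""

    # The bare "*" makes every argument keyword-only, so no positional
    # defaults are needed for object_type and object_id.
    def __init__(
        self,
        *,
        object_type: SketchObjectType,
        object_id: bytes,
        qualifiers: Optional[Dict[str, str]] = None,
    ) -> None:
        if not isinstance(object_type, SketchObjectType):
            raise TypeError("object_type must be a SketchObjectType")
        # Assumption: the id is a raw 20-byte SHA-1 digest, not a hex string.
        if not isinstance(object_id, bytes) or len(object_id) != 20:
            raise ValueError("object_id must be 20 raw bytes")
        self.object_type = object_type
        self.object_id = object_id
        self.qualifiers = dict(qualifiers or {})

    def __str__(self) -> str:
        core = f"swh:1:{self.object_type.value}:{self.object_id.hex()}"
        quals = "".join(f";{k}={v}" for k, v in self.qualifiers.items())
        return core + quals
```

Calling `SketchSWHID(SketchObjectType.CONTENT, some_bytes)` positionally raises a `TypeError`, which is the point of the keyword-only signature.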
-
- Dec 30, 2020
-
-
Stefano Zacchiroli authored
Before this change there was a lot of overlap between parse_swhid() and the attrs-based validators in the SWHID class. Also, the validation implementation in parse_swhid() was done by hand.

With this change the coarse-grained validation done by parse_swhid() is now delegated to a regex. The semantic validation of SWHIDs is left to attrs validators. The regex is also exposed as a module attribute, to be used by client code that wants to syntactically validate SWHIDs without necessarily instantiating SWHID classes (we have several other modules doing that already, and they are using slightly different hand-made regexes, which isn't great).

As part of this change we also clean up the use of ValidationError exceptions, systematically passing the problematic parts of the SWHID as arguments, and make error messages uniform.

This change also brings some speedup in SWHID parsing. On a benchmark parsing ~30 M valid SWHIDs, the previous implementation took ~3:06 minutes, the new one ~2:50 minutes, or a ~9% speedup.

Closes T2788
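
As a rough illustration of regex-based syntactic validation, here is a sketch. The pattern below is written for this example only; the regex actually exposed by the module, its attribute name, and the exact set of accepted object types are not shown in this log and may differ.

```python
import re

# Hypothetical pattern, NOT the one exposed by swh.model.
# Assumed shape: swh:<version>:<type>:<40 lowercase hex chars>[;qualifiers]
SKETCH_SWHID_RE = re.compile(
    r"swh:(?P<version>1)"
    r":(?P<object_type>snp|rel|rev|dir|cnt)"
    r":(?P<object_id>[0-9a-f]{40})"
    r"(?:;(?P<qualifiers>\S+))?"
)


def looks_like_swhid(text: str) -> bool:
    """Coarse syntactic check only; semantic checks would happen elsewhere."""
    # fullmatch() requires the whole string to match, so no ^/$ anchors
    # are needed and a trailing newline is rejected.
    return SKETCH_SWHID_RE.fullmatch(text) is not None
```

Client code that only needs a quick syntactic filter can call `looks_like_swhid()` without constructing any identifier object.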
-
- Nov 12, 2020
-
-
Antoine R. Dumont authored
Related to T2729
-
Antoine R. Dumont authored
So parse_swhid raises a ValidationError when that is detected. Related to T2769
-
Antoine R. Dumont authored
Related to T2769
-
Antoine R. Dumont authored
Related to T2769
-
- Oct 14, 2020
-
-
Nicolas Dandrimont authored
This collapses the shared logic between these two identifier computations into a few more explicit steps:

- generate data for the manifest (in either identifier computation);
- format the manifest (in the new format_manifest function);
- hash the manifest (in the new hash_manifest function).

This will enable reusing this logic for more object types, as well as stronger typing for the manifest computation.
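
The three steps above could be sketched as follows. The function names carry a `sketch_` prefix because their signatures are assumptions, not those of the real format_manifest and hash_manifest; the hashing step assumes Git's object convention (SHA-1 over a `<type> <length>\0` header followed by the body).

```python
import hashlib
from typing import Iterable, Optional, Tuple


def sketch_format_manifest(
    headers: Iterable[Tuple[bytes, bytes]], message: Optional[bytes] = None
) -> bytes:
    """Step 2: serialize (key, value) header pairs, then an optional message."""
    lines = [key + b" " + value + b"\n" for key, value in headers]
    if message is not None:
        # A blank line separates the headers from the free-form message.
        lines.append(b"\n" + message)
    return b"".join(lines)


def sketch_hash_manifest(git_type: str, manifest: bytes) -> bytes:
    """Step 3: hash the manifest the way Git hashes an object of that type."""
    prefix = f"{git_type} {len(manifest)}".encode("ascii") + b"\0"
    return hashlib.sha1(prefix + manifest).digest()


# Step 1 (generating the data) stays specific to each object type; the
# header keys and values here are made up for the example.
example_headers = [(b"object", b"0" * 40), (b"type", b"commit")]
example_id = sketch_hash_manifest(
    "tag", sketch_format_manifest(example_headers, b"v1.0\n")
)
```

Keeping step 1 separate from the shared formatting and hashing helpers is what lets new object types reuse them.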
-
- Sep 18, 2020
-
-
Thibault Allançon authored
Use the new SWHID naming convention instead of SWH PID.
-
- Sep 17, 2020
-
-
Antoine Lambert authored
Related to T2610
-
- Jul 08, 2020
-
-
Antoine Lambert authored
-
- Jul 07, 2020
- Jul 06, 2020
-
-
David Douard authored
Add a new extra_headers attribute on Revision and use it for computing the revision's id instead of extracting it from the metadata field. Only accept (bytes, bytes) as extra_header. Add a post-init hook to Revision to initialize this new attribute from the given metadata, if any, for backward compatibility. Also amend the revision_d hypothesis strategy to generate extra_headers.
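
A minimal sketch of the post-init fallback described here, using a plain dataclass rather than the real Revision class. `SketchRevision` is hypothetical, and the assumption that legacy values live under an `"extra_headers"` key of the metadata dict is made for illustration only.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple


@dataclass(frozen=True)
class SketchRevision:
    """Illustrative stand-in, not the swh.model Revision class."""

    message: bytes
    metadata: Optional[Dict[str, Any]] = None
    extra_headers: Tuple[Tuple[bytes, bytes], ...] = ()

    def __post_init__(self) -> None:
        headers = self.extra_headers
        # Backward compatibility: fall back to the metadata dict
        # (assumed key name) when no explicit headers are given.
        if not headers and self.metadata:
            headers = self.metadata.get("extra_headers", ())
        normalized = tuple(tuple(pair) for pair in headers)
        for pair in normalized:
            # Only (bytes, bytes) pairs are accepted.
            if len(pair) != 2 or not all(isinstance(x, bytes) for x in pair):
                raise TypeError("extra_headers items must be (bytes, bytes)")
        # The dataclass is frozen, so bypass the generated __setattr__.
        object.__setattr__(self, "extra_headers", normalized)
```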
-
- Jul 03, 2020
-
-
Antoine Lambert authored
When Software Heritage persistent identifiers were introduced, they were not yet abbreviated as SWHIDs. Now that the abbreviation is gaining adoption, rename some functions and types in swh.model.identifiers for consistency:

- PersistentId -> SWHID
- persistent_identifier -> swhid
- parse_persistent_identifier -> parse_swhid

Backward compatibility with the previous naming is maintained, but deprecation warnings are introduced to encourage the use of the new names. Numerous variables in the swh.model codebase have also been renamed accordingly. Also rework and improve the documentation.
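
One common way to keep an old name working while steering callers to the new one is a thin wrapper that emits a DeprecationWarning. The sketch below shows that general pattern only; the `_sketch` functions are hypothetical and their bodies are placeholders, not the swh.model parsing logic.

```python
import warnings
from typing import Tuple


def parse_swhid_sketch(text: str) -> Tuple[str, ...]:
    """Placeholder for the new-style function (real parsing omitted)."""
    return tuple(text.split(":"))


def parse_persistent_identifier_sketch(text: str) -> Tuple[str, ...]:
    """Old name kept for backward compatibility; delegates to the new one."""
    warnings.warn(
        "parse_persistent_identifier_sketch is deprecated, "
        "use parse_swhid_sketch instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return parse_swhid_sketch(text)
```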
-
- Jun 15, 2020
-
-
David Douard authored
thus in TimestampWithTimezone.from_dict(). This is needed to help consume existing (invalid) messages from Kafka. Warning: tests added in this revision do not cover the whole normalize_timestamp() function.
-
- Apr 17, 2020
-
-
Stefano Zacchiroli authored
-
- Apr 08, 2020
-
-
David Douard authored
- blackify all the Python files,
- enable black in pre-commit,
- add a black tox environment.
-
- Mar 23, 2020
-
-
Antoine Pietri authored
-
- Feb 21, 2020
-
-
vlorentz authored
It should be cheap enough to do it, and it makes tests easier.
-
- Nov 29, 2019
-
-
Antoine Lambert authored
-
- Oct 06, 2019
-
-
Stefano Zacchiroli authored
-
- Oct 04, 2019
-
-
... from test_persistent_identifier. Closes T1986
-
- Sep 20, 2019
-
-
Stefano Zacchiroli authored
-
Stefano Zacchiroli authored
-
- Aug 23, 2019
-
-
Stefano Zacchiroli authored
-
- Jul 10, 2019
-
-
vlorentz authored
-
- Jun 27, 2019
-
-
Ishan Bhanuka authored
-
- Apr 08, 2019
-
-
vlorentz authored
-
- Apr 04, 2019
-
- Sep 21, 2018
-
-
Antoine R. Dumont authored
-
- Jul 20, 2018
-
-
Antoine R. Dumont authored
Related T1152
-