- Feb 08, 2022
-
-
Raphaël Gomès authored
There will be a related patch for the hg and the bzr loaders
-
- Feb 03, 2022
-
-
Nicolas Dandrimont authored
This increases overall consistency and makes us compatible with the latest version of swh.storage, which does Person.from_fullname() parsing on output if name and email are None.
-
Nicolas Dandrimont authored
-
- Jan 25, 2022
-
-
vlorentz authored
No need to filter out revisions as well, this is already handled elsewhere. Along with D7028, this resolves T3884.
-
- Jan 24, 2022
-
-
vlorentz authored
They may point to non-existing objects, which is useless.
-
- Jan 21, 2022
-
-
Raphaël Gomès authored
This will also be used by the new bzr loader. A separate patch will refactor this in the hg loader.
-
vlorentz authored
It will be replaced by what is currently called 'offset_bytes'
-
vlorentz authored
-
- Jan 14, 2022
-
- Jan 13, 2022
-
-
vlorentz authored
swh-deposit v0.17.0 removes it, to match the removal from swh-model v5.0.0
-
- Jan 11, 2022
- Dec 22, 2021
-
-
vlorentz authored
-
- Dec 16, 2021
-
-
Antoine R. Dumont authored
This also drops spurious copyright headers from those files, if present. Related to T3812
-
- Dec 09, 2021
-
- Dec 08, 2021
-
-
vlorentz authored
This solves two problems:
1. If the URL changes but the content doesn't, the new snapshot would keep using the release with the old URL in its name.
2. If there are two URLs pointing to the same content, the base loader would crash because it cannot know which one to pick.
-
vlorentz authored
-
vlorentz authored
instead of just its netloc, as it is possible to have multiple maven instances hosted under the same domain but at different paths. The code is also simpler this way.
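To illustrate why the netloc alone is ambiguous, here is a minimal sketch. The two repository URLs are made up for the example, and how the loader actually uses the value is not shown in this log.

```python
from urllib.parse import urlparse

# Hypothetical Maven instances hosted under the same domain, at different paths.
BASE_URLS = [
    "https://repo.example.org/maven2/",
    "https://repo.example.org/internal/",
]

# Keying on the netloc collapses both instances into one entry...
netlocs = {urlparse(url).netloc for url in BASE_URLS}

# ...whereas keying on the full base URL keeps them distinct.
full_urls = set(BASE_URLS)
```

Here `netlocs` holds a single value while `full_urls` holds two, so only the full URL can tell the instances apart.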
-
- Dec 07, 2021
-
-
vlorentz authored
Snapshots should record only versions that currently exist, not ones that existed only in a previous visit. Readers of the archive who want to access deleted versions can look up older snapshots.
-
vlorentz authored
-
vlorentz authored
We don't need it to be ordered, and '.keys()' is redundant.
-
vlorentz authored
-
vlorentz authored
-
vlorentz authored
-
vlorentz authored
It was copied from the Archive Loader, but is not needed here.
-
vlorentz authored
Use only the intrinsic version (e.g. 1.0.0) instead of the extrinsic version (e.g. stretch/contrib/1.0.0). Releases should only contain data from the DSC, not external 'pointers' to them. Additionally, having extrinsic data in releases means the same dsc-sha256 extid can point to different releases, so the loader might reuse a release mentioning a specific suite as a release in a different suite. With this commit, this is no longer a problem: releases don't mention the suite at all, so suites can safely share extids.
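The distinction can be sketched as follows. Both helper names are hypothetical, and the assumption that the release name is simply the encoded intrinsic version is an illustration, not the loader's actual code.

```python
def intrinsic_version(extrinsic_version: str) -> str:
    """Strip a leading 'suite/component/' prefix, if any (hypothetical helper)."""
    # Debian version strings contain no '/', so the last segment is the version.
    return extrinsic_version.rsplit("/", 1)[-1]


def release_name(extrinsic_version: str) -> bytes:
    # Assumption: the release name carries only the intrinsic version.
    return intrinsic_version(extrinsic_version).encode()
```

With this, `release_name("stretch/contrib/0.7.2-3")` and `release_name("buster/main/0.7.2-3")` give the same value, which is what lets two suites share one release for the same DSC.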
-
vlorentz authored
'version' was documented as the intrinsic version (e.g. '0.7.2-3') and 'full_version' as the one containing the suite name (e.g. 'stretch/contrib/0.7.2-3'). In practice, it was the opposite, except in a few incorrect tests. This commit fixes said tests, and renames 'full_version' to 'intrinsic_version'. This is only a refactoring, the behavior is unchanged for now; but a future commit will remove the 'version' (which is extrinsic) from the release name (which should contain only data intrinsic to the DSC).
-
- Dec 06, 2021
-
-
Antoine Lambert authored
To check that a package file was downloaded successfully, the debian loader compares the sha256 or sha1 checksum of the file with the one located in the debian dsc file. However, for old debian-based distributions (some old ubuntu releases for instance), the only checksum available in the dsc file is an md5 sum. So add a fallback that uses the md5 sum to check the download when the sha* checksum is missing from the dsc file. Related to T2400
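A minimal sketch of such a fallback, assuming the dsc checksums are available as a plain dict keyed by algorithm name. The function names, the dict shape and the preference order are assumptions made for this example, not the loader's actual interface.

```python
import hashlib

# Assumed preference order: strongest available checksum first, md5 last.
PREFERRED_ALGOS = ("sha256", "sha1", "md5")


def pick_checksum(dsc_checksums):
    """Return (algo, expected_hex) for the first preferred checksum present."""
    for algo in PREFERRED_ALGOS:
        value = dsc_checksums.get(algo)
        if value:
            return algo, value
    raise ValueError("no supported checksum in dsc metadata")


def verify_download(data, dsc_checksums):
    """Check downloaded bytes against the best checksum the dsc provides."""
    algo, expected = pick_checksum(dsc_checksums)
    return hashlib.new(algo, data).hexdigest() == expected
```

For a dsc file that only lists an md5 sum, `pick_checksum` falls through to `"md5"`; when a sha256 or sha1 sum is present, the md5 sum is never consulted.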
-
Boris Baldassari authored
The maven loader loads jar and zip files as Maven artefacts into the Software Heritage archive. Note: Supersedes D6158 and addresses the review done in that diff. Related to T1724
-
- Dec 03, 2021
-
-
Antoine R. Dumont authored
Related to T3763
-
Antoine R. Dumont authored
So package loaders can actually finish their ingestion even when multiple releases target the same directory. Related to T3763
-
Antoine Lambert authored
The loading task function must be named load_{visit_type} in order for the scheduler to successfully create loading tasks. The visit type name for debian packages is deb, so the loading task function must be renamed to load_deb. Related to T2400
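The naming convention can be sketched like this. The registry dict and the lookup helper are hypothetical stand-ins for illustration, not the scheduler's real mechanism.

```python
def load_deb(**kwargs):
    """Placeholder loading task for the 'deb' visit type."""
    return {"status": "uneventful"}


# Hypothetical registry of task functions, keyed by function name.
TASKS = {func.__name__: func for func in (load_deb,)}


def find_task(visit_type):
    """Look up the task function following the load_{visit_type} convention."""
    name = f"load_{visit_type}"
    if name not in TASKS:
        raise LookupError(f"no task function named {name}")
    return TASKS[name]
```

Under this convention `find_task("deb")` resolves to `load_deb`, while a function named after anything other than the visit type would not be found.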
-
Antoine Lambert authored
Some debian source package metadata have extra sha1 sums for their files, for instance those from the ubuntu hirsute suite. So add an optional sha1 field in DebianFileMetadata model in order to avoid loading errors. Related to T2400
-
Antoine Lambert authored
-
- Dec 02, 2021
-
-
vlorentz authored
To match the current version of the code.
-
- Dec 01, 2021
- Nov 22, 2021
-
-
vlorentz authored
Authors: use the empty string '' instead of placeholders.
Message: use the same message format (inspired by the Debian loader) for all loaders, instead of the empty string / the version / something else; except for PyPI and Deposit (which have a better format because we have more metadata available).
Additionally, this commit adds tests of each release object, instead of only relying on its hash.