- May 15, 2023
-
-
vlorentz authored
It is needed by the RPM loader, introduced in swh.loader.core v5.2.0
-
- Apr 26, 2023
-
-
Jenkins for Software Heritage authored
Update to upstream version '5.3.0' with Debian dir 3c9e4d1c4fd291937d636dc8b6f0b4207abecb4d
-
Antoine R. Dumont authored
NodeLoader as in {Content|Directory}Loader. Those can directly ingest, respectively, a (remote) file (as Content) or a (remote) tarball artifact (as Directory). They are currently listed by the nixguix lister. Depending on the checksum_layout (standard, nar), they compute checksums differently. With the "standard" checksum layout, they compute the usual "swh" checksums (sha1, sha1_git, ...) and validate them. With the "nar" checksum layout, they compute the "nar" checksums and store those (once validated) as ExtID. This keeps the node loader constructor backward compatible with the previous version: it still accepts `checksums_computation` as an alias of `checksum_layout` and falls back to the "standard" layout when nothing is provided. Refs. swh/meta#4979
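The fallback described above could look roughly like the following sketch. The helper name `resolve_checksum_layout` and the validation step are assumptions made for illustration; they are not taken from swh.loader.core.

```python
from typing import Optional

# Layout names mentioned in the commit message.
VALID_LAYOUTS = ("standard", "nar")


def resolve_checksum_layout(
    checksum_layout: Optional[str] = None,
    checksums_computation: Optional[str] = None,
) -> str:
    """Hypothetical helper: pick the layout, accepting the legacy name too."""
    # The new parameter wins, the legacy one is an alias, "standard" is the default.
    layout = checksum_layout or checksums_computation or "standard"
    if layout not in VALID_LAYOUTS:
        raise ValueError(f"Unsupported checksum layout: {layout!r}")
    return layout
```

Callers that still pass only `checksums_computation="nar"` would keep getting the "nar" behavior, and callers that pass nothing would get "standard".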
- Apr 25, 2023
-
-
Antoine R. Dumont authored
After validating the nar integrity checksums provided by the lister, this now also stores those as ExtID entries (as many as there are checksum hashes provided at initialization time). This behavior is shared by both Content and Directory loaders. It only occurs when the `checksums_computation` constructor parameter is set to "nar". Refs. swh/meta#4979
-
Antoine R. Dumont authored
The implementation no longer requires such a binary. Refs. swh/meta#4979
-
- Apr 13, 2023
-
-
Antoine Lambert authored
The http_retry decorator from swh-lister has been moved to swh-core, so we can now use it in swh-loader-core instead of duplicating retry code. Moreover, it also makes it possible to retry HTTP requests on errors like 502, 503 or 504 instead of only retrying on 429.
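To illustrate the retry policy described above, here is a standalone sketch. It does not use or reproduce the swh-core `http_retry` decorator; `call_with_retry`, its parameters and the backoff formula are hypothetical.

```python
import time
from typing import Callable, Tuple

# Status codes named in the commit message as worth retrying.
RETRYABLE_STATUS = {429, 502, 503, 504}


def call_with_retry(
    fetch: Callable[[], Tuple[int, str]],
    max_attempts: int = 3,
    base_delay: float = 0.0,
) -> Tuple[int, str]:
    """Call fetch() again while it returns a retryable HTTP status."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status, body = fetch()
    attempt = 1
    while status in RETRYABLE_STATUS and attempt < max_attempts:
        # Exponential backoff between attempts (no wait when base_delay is 0).
        time.sleep(base_delay * 2 ** (attempt - 1))
        status, body = fetch()
        attempt += 1
    return status, body
```

Non-retryable statuses such as 404 are returned after a single call, and the last response is returned as-is once the attempts are exhausted.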
-
- Apr 06, 2023
-
-
Antoine Lambert authored
Previously, any package loader that produced a snapshot with dangling branches for an origin failed to load that origin again, as a None check was missing when processing the last snapshot. Fix #3449.
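The kind of guard involved can be sketched as follows. Branches are modeled here as plain dicts, with `None` standing for a dangling branch; the function name and the dict keys are assumptions for illustration, not the loader's actual code.

```python
from typing import Dict, Optional, Set


def known_release_targets(branches: Dict[bytes, Optional[dict]]) -> Set[bytes]:
    """Collect release targets from a previous snapshot's branches."""
    targets: Set[bytes] = set()
    for branch in branches.values():
        if branch is None:
            # Dangling branch: nothing to inspect, so skip it instead of failing.
            continue
        if branch["target_type"] == "release":
            targets.add(branch["target"])
    return targets
```

Without the `None` check, subscripting a dangling branch raises a `TypeError`, which matches the failure mode described above.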
-
- Apr 05, 2023
-
-
Antoine Lambert authored
Some packages can have an invalid version string in their changelog, which raises a ValueError when attempting to parse it in the get_intrinsic_package_metadata method of the loader. As a consequence, each package release with such a bogus entry in the changelog was discarded from the snapshot created by the loader. So prefer to get the raw version string instead of parsing it, to work around that issue. Fix #1493.
-
Antoine R. Dumont authored
This pushes options from cli to the Nar class. This also bootstraps tests reusing existing test cases from before (using the nix binary). Refs. swh/meta#4979
-
Antoine R. Dumont authored
This also exposes a `swh nar` cli.

```
$ Usage: swh nar [OPTIONS] DIRECTORY

  Compute NAR hashes on a directory.

Options:
  -x, --exclude-vcs               exclude version control directories
  -H, --hash-algo [sha256|sha1]
  -f, --format-output [hex|base32|base64]
  --debug / --no-debug
  -h, --help                      Show this message and exit.
```

Refs. swh/meta#4979
-
- Apr 04, 2023
-
-
Kumar Shivendu authored
-
-
- Mar 27, 2023
-
-
Jérémy Bobbio (Lunar) authored
After switching to the PyData Sphinx theme, the very large table with the loader specification became too wide to be readable. These changes make the table scrollable again, remove the right sidebar, highlight the name column and add stripes to rows. They are a stopgap measure, as such a table might not be the best way to present this information.
-
- Mar 08, 2023
-
-
Antoine Lambert authored
The version number of a pubdev package containing a dash character fails to be parsed by packaging.parse_version. So split the version number on the dash to extract its parsable part and fix the loading of such packages. Resolves #4741
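A minimal sketch of the dash-splitting idea follows. To stay dependency-free it uses a simple integer-tuple key as a stand-in for the `packaging` parser, and it assumes the part before the dash is made of dot-separated integers; `sortable_version` is a hypothetical name.

```python
from typing import Tuple


def sortable_version(version: str) -> Tuple[int, ...]:
    """Return a sort key built from the part of the version before any dash."""
    # "1.2.0-nullsafety.0" -> "1.2.0"; versions without a dash are unchanged.
    base = version.split("-", 1)[0]
    return tuple(int(part) for part in base.split("."))


versions = ["1.0.0-beta", "0.9.1", "1.2.0-nullsafety.0"]
ordered = sorted(versions, key=sortable_version)
```

The original version strings are kept as they are; only the sort key ignores the suffix after the dash.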
-
- Feb 23, 2023
-
-
Jérémy Bobbio (Lunar) authored
GitLab will display the content of the README file when browsing the repository. But in case the file is a symlink, it will display the path pointed to by the symlink. There is a 6-year-old issue about this: https://gitlab.com/gitlab-org/gitlab/-/issues/15093 . We can work around the issue by having the content at the root of the repository and a symlink to this file in the `docs/` directory. Tested in swh-py-template!27
-
- Feb 20, 2023
-
-
Related to swh/meta#4959
-
- Feb 17, 2023
-
-
Antoine Lambert authored
Related to swh/meta#4960
-
- Feb 13, 2023
-
-
Jenkins for Software Heritage authored
Update to upstream version '5.2.0' with Debian dir 5b054142da7957412626e21ed026d60cfd24eb1c
- Feb 02, 2023
-
-
Antoine Lambert authored
This fixes Python 3.7 support, which broke because poetry, a dependency of isort, removed support for that Python version in a recent release.
-
- Jan 13, 2023
-
-
vlorentz authored
There is code in swh/loader/cli.py, and swh-loader-metadata will need to import cli.py, causing mypy to complain when py.typed is missing.
-
Antoine R. Dumont authored
This introduces a `create_partial_snapshot` parameter to the base loader constructor. When activated, during each call of the `store_data` method, if there is more data to fetch, this will create a partial snapshot (and an associated visit status). The final loop behaves as before, creating the last visit status 'full' targeting the final snapshot. The main difference between the 2 behaviors is that an ingestion with that parameter on is more verbose in terms of origin_visit_status. This, in turn, allows subsequent visits for the same origin to be incremental. This may be especially interesting for cases where loading fails due to resource issues (e.g. large svn or git repositories). Related to T3625
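The control flow described above can be sketched with a toy loop. Everything here is simplified and hypothetical: batches of strings stand in for fetched data, and `(status, snapshot content)` tuples stand in for visit statuses. Only the `create_partial_snapshot` parameter name comes from the commit message.

```python
from typing import List, Sequence, Tuple

VisitStatus = Tuple[str, List[str]]


def load(
    batches: Sequence[List[str]], create_partial_snapshot: bool = False
) -> List[VisitStatus]:
    """Toy ingestion loop recording one visit status per snapshot created."""
    statuses: List[VisitStatus] = []
    stored: List[str] = []
    for index, batch in enumerate(batches):
        stored.extend(batch)  # stands in for store_data()
        more_data_to_fetch = index < len(batches) - 1
        if create_partial_snapshot and more_data_to_fetch:
            # Intermediate snapshot of what has been stored so far.
            statuses.append(("partial", list(stored)))
    # The final step is the same with or without the option.
    statuses.append(("full", list(stored)))
    return statuses
```

With the option on, a run interrupted midway would still leave partial statuses behind, which is what lets a later visit pick up from them.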
-
- Dec 20, 2022
-
-
Antoine Lambert authored
Release 22.0 of the packaging module can no longer parse invalid Python version numbers; an exception is now raised. The Conda loader used the keys of the packages dict as version numbers to sort, which are in the form "<arch>/<version>-<build>", but those cannot be parsed anymore. So extract the intrinsic version numbers of packages instead to sort the list of versions. Also update snapshot release names to "<version>-<build>-<arch>", as each release for a given architecture targets a different directory.
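As a small illustration of the renaming, the sketch below turns a packages-dict key of the form "<arch>/<version>-<build>" into a "<version>-<build>-<arch>" release name. The helper name and the sample key are made up for the example.

```python
def release_name(package_key: str) -> str:
    """Map "<arch>/<version>-<build>" to "<version>-<build>-<arch>"."""
    # Split on the first slash only; the arch itself may contain dashes.
    arch, version_and_build = package_key.split("/", 1)
    return f"{version_and_build}-{arch}"


name = release_name("linux-64/0.1.1-py37_0")
```

Keeping the architecture in the release name keeps two builds of the same version apart in the snapshot.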
-
Antoine Lambert authored
Release 22.0 of the packaging module can no longer parse invalid Python version numbers; an exception is now raised. The RPM loader used the keys of the packages dict as version numbers to sort, which are in the form "<distribution>/<edition>/<package_version_number>", but those cannot be parsed anymore. So use the intrinsic version numbers of packages instead to sort the list of versions.
-
vlorentz authored
-
- Dec 19, 2022
-
-
Antoine Lambert authored
In order to remove warnings about /apidoc/*.rst files being included multiple times in toc when building full swh documentation, prefer to include module indices only when building standalone package documentation. Also include them the proper sphinx way. Related to T4496
-
- Nov 21, 2022
-
-
Franck Bret authored
The loader makes an HTTP API call to retrieve the versions of a package. It then downloads the tar.gz archive for each version.
-
- Nov 16, 2022
-
-
Kumar Shivendu authored
-
- Nov 15, 2022
-
-
Antoine R. Dumont authored
This got migrated to swh-loader-git, the sole module using it. Related to D7868
-
- Nov 14, 2022
-
-
Antoine Lambert authored
Some maven artifacts do not have any sha1 sums computed but rather md5 ones, so handle these edge cases to still check the download integrity of jar files.
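One way to express the fallback is sketched below with `hashlib`. The function name, the `checksums` mapping shape and the preference order are assumptions for illustration; the maven loader's actual code may be organized differently.

```python
import hashlib
from typing import Dict

# Prefer sha1 when available, fall back to md5 otherwise.
PREFERRED_ALGOS = ("sha1", "md5")


def verify_download(data: bytes, checksums: Dict[str, str]) -> str:
    """Check data against the first available checksum; return the algorithm used."""
    for algo in PREFERRED_ALGOS:
        expected = checksums.get(algo)
        if expected is None:
            continue
        actual = hashlib.new(algo, data).hexdigest()
        if actual != expected.lower():
            raise ValueError(f"{algo} mismatch: expected {expected}, got {actual}")
        return algo
    raise ValueError("No supported checksum provided")
```

An artifact that only ships an md5 sum is still verified, rather than being accepted unchecked or rejected outright.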
-
Antoine Lambert authored
Use mocked network requests to get jar and pom files instead of reading them from the datadir directory.
-
- Nov 03, 2022
-
-
Antoine Lambert authored
This avoids downloading and processing a release archive for a CPAN module if it has already been archived by Software Heritage. Related to T2833
-
- Nov 02, 2022
-
-
Franck Bret authored
provided by the lister extra_loader_arguments. Use artifacts and rubygems_metadata to get list of versions, artifacts checksums and extrinsic metadata url. Add an EXTID manifest. Set metadata from extrinsic metadata.
-
- Oct 27, 2022
-
-
Reviewers: #reviewers, anlambert Subscribers: anlambert Maniphest Tasks: T4581 Differential Revision: https://forge.softwareheritage.org/D8569
-
- Oct 26, 2022
-
-
Jenkins for Software Heritage authored
Update to upstream version '5.1.0' with Debian dir 46568b9c977eac5642d6fcc7022a69c5c13455b2