- May 21, 2024
-
-
David Douard authored
This should help identify the full path of these classes independently of how their module is loaded (e.g. to have a consistent logger name), which can otherwise make tests fail for no valid reason.
-
- May 15, 2024
-
-
Pierre-Yves David authored
-
Pierre-Yves David authored
-
- Mar 28, 2024
-
-
vlorentz authored
-
- Mar 05, 2024
-
-
Add an extra check in function fetch_extids_from_checksums to ensure a NAR hash extid matches the NAR hash of the targeted archived object. Related to swh/infra/sysadm-environment#5256.
-
- Feb 26, 2024
-
-
Antoine R. Dumont authored
Currently, those impacted loaders are only used through the nixguix stack but they are not specific to nixguix. Refs. swh/devel/swh-loader-core#4749
-
Antoine R. Dumont authored
It has been decommissioned in production and replaced by a lister and various loaders (as per the README). Refs. swh/meta#3781
-
- Feb 16, 2024
-
-
Antoine R. Dumont authored
This now matches the tarball loader behavior (top-level directory included [1]). This also matches what's expected by the guix dataset. As the nix hashes are computed from the first directory included in the tarball, though, we must also provide that directory. That way, the hash checks done during ingestion can match appropriately. That was the initial implementation. In terms of data, as this will change the visit snapshot and the extid mappings, the core loaders (NodeLoader, ...) now declare an extid_version bumped to 1 (it was 0 by default), which means that all extid mappings will be recomputed. [1] https://gitlab.softwareheritage.org/swh/devel/swh-loader-core/-/blob/master/swh/loader/package/loader.py?ref_type=heads#L829-837 Refs. swh/infra/sysadm-environment#5222
-
- Feb 05, 2024
-
-
Antoine Lambert authored
Related to swh/meta#5075.
-
There are cases where the sha256 checksum for a source package is missing, typically for legacy Debian releases. So ensure that the DebianPackageInfo class never returns a null extid by falling back to the sha1 or md5sum checksum instead.
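The fallback described above can be sketched as follows. This is a minimal illustration, not the actual DebianPackageInfo code: the function name, the dictionary keys and the preference order are assumptions.

```python
from typing import Mapping, Optional, Tuple

# Assumed preference order: sha256 first, then the fallbacks named in the
# commit message. The key names are hypothetical.
CHECKSUM_PREFERENCE = ("sha256", "sha1", "md5sum")


def pick_extid_checksum(checksums: Mapping[str, Optional[str]]) -> Tuple[str, str]:
    """Return (algorithm, hex digest) for the first checksum that is present."""
    for algo in CHECKSUM_PREFERENCE:
        value = checksums.get(algo)
        if value:  # skip missing keys as well as None or empty values
            return algo, value
    raise ValueError("no usable checksum for this source package")
```

Raising when nothing is available keeps a null extid from being produced silently.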
-
There might be cases where multiple versions of a package target the same release object. For instance, an RPM package has one specific version for each distribution release, but these versions can target the same intrinsic version with exactly the same source package contents. So avoid downloading and processing a package version if the corresponding extid has already been encountered during the current loading, by maintaining a mapping between extids and release SWHIDs.
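A minimal sketch of that mapping, assuming each package version comes with an already known extid. The names `load_versions` and `load_release` are hypothetical stand-ins for the download-and-process step, not the loader's real API.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def load_versions(
    packages: Iterable[Tuple[str, bytes]],
    load_release: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Map each version to a release identifier, processing each extid once."""
    seen: Dict[bytes, str] = {}  # extid -> release identifier
    result = []
    for version, extid in packages:
        if extid not in seen:
            # First time this extid is met in the current loading: do the
            # expensive download and processing.
            seen[extid] = load_release(version)
        result.append((version, seen[extid]))
    return result
```

Versions that share an extid end up pointing at the same release without triggering a second download.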
-
- Feb 02, 2024
-
-
This way also avoids checking the mountpoint permissions or the file owner, which currently results in nar mismatch on (nixguix) origins with executables (for various loaders e.g. tarball, git-checkout, ...) in our production setup. Refs. swh/infra/sysadm-environment#5230
-
Nicolas Dandrimont authored
The referenced requests_mock bug has been fixed upstream
-
Nicolas Dandrimont authored
-
- Jan 26, 2024
-
-
Antoine Lambert authored
Retrying 5 times with an exponential backoff could put those loaders to sleep for more than 20 minutes, so reduce the number of retries to 3 to make the loaders sleep for only a couple of seconds instead.
-
Antoine Lambert authored
-
- Jan 17, 2024
-
-
David Douard authored
Where the usage of async got dropped from the discovery protocol.
-
- Jan 16, 2024
-
-
Antoine Lambert authored
The requests package sets the Accept-Encoding HTTP header to "gzip, deflate" by default, and some servers (https://download.ocamlcore.org/ for instance) will then send a compressed version of the artifact to download, with the Content-Encoding response header usually set to gzip. Nevertheless, this conflicts with the code checking whether the response bytes should be uncompressed, as they should not be when the Content-Encoding header is equal to gzip and Content-Type is equal to application/*gzip. As artifacts to download are usually tarballs already compressed with gzip, set the Accept-Encoding request header to identity in order to force the server to send the raw artifact bytes without the Content-Encoding header set.
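As an illustration of the header override, here is a small sketch using the requests package. The helper names are hypothetical and the real download code in the loader may be organized differently.

```python
from typing import Dict, Optional


def artifact_request_headers(extra: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Build request headers asking the server not to apply any content coding."""
    headers = dict(extra or {})
    headers["Accept-Encoding"] = "identity"
    return headers


def download_artifact(url: str, dest_path: str) -> None:
    """Stream an artifact to disk exactly as stored on the server (hypothetical helper)."""
    import requests  # third-party; imported lazily so the helper above stays dependency-free

    with requests.get(url, headers=artifact_request_headers(), stream=True) as response:
        response.raise_for_status()
        with open(dest_path, "wb") as dest:
            for chunk in response.iter_content(chunk_size=65536):
                dest.write(chunk)
```

Per-request headers take precedence over the session defaults in requests, so the default "gzip, deflate" value is replaced for this call only.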
-
Antoine Lambert authored
It seems a better name, as shorter is better.
-
- Jan 15, 2024
-
-
Antoine Lambert authored
The NodeLoader class handles two checksum layouts: "standard", where the checksum is computed from the raw downloaded artifact bytes, and "nar", where the checksum is a NAR hash recursively computed from the source code tree. Previously, only nar checksums were stored as ExtIDs while standard ones were only used for integrity checks after downloads. As mapping a tarball (resp. file) standard checksum to its corresponding directory (resp. content) SWHID is of interest for Guix to check whether SWH archived this software artifact, ensure that standard checksums are also saved as ExtIDs of type "checksum-<hash_algo>".
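To make the "checksum-<hash_algo>" naming concrete, here is a minimal sketch that derives such an ExtID type and value from raw artifact bytes. The function name is hypothetical; only the type naming scheme comes from the commit message.

```python
import hashlib
from typing import Tuple


def standard_checksum_extid(artifact: bytes, hash_algo: str = "sha256") -> Tuple[str, bytes]:
    """Return (extid_type, extid) for a checksum computed over raw artifact bytes."""
    digest = hashlib.new(hash_algo, artifact).digest()
    return f"checksum-{hash_algo}", digest
```

For a tarball the resulting ExtID would target a directory, and for a single file a content, as described above.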
-
- Dec 07, 2023
-
-
Antoine Lambert authored
-
Antoine Lambert authored
Commit 424540f1 broke the CLI launch of VCS loaders as those did not get their swh.workers entry points renamed. So revert it and restore the previous swh.workers entry point names instead.
-
Those located in swh/loader/tests were no longer collected.
-
Entry points related to loaders have been renamed from "loader.<type>" to "swh.loader.<type>" so update code gathering all loader types in a list.
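Gathering loader types from the renamed entry points could look like the sketch below. It assumes the entry points live in the "swh.workers" group and that the loader type is whatever follows the "swh.loader." prefix; both function names are hypothetical.

```python
from typing import Iterable, List

PREFIX = "swh.loader."


def loader_types(entry_point_names: Iterable[str]) -> List[str]:
    """Keep only loader entry points and strip the common prefix."""
    return sorted(
        name[len(PREFIX):] for name in entry_point_names if name.startswith(PREFIX)
    )


def installed_loader_types() -> List[str]:
    """List loader types registered in the current environment (Python 3.10+)."""
    from importlib.metadata import entry_points

    # The group name is an assumption based on the commit messages.
    return loader_types(ep.name for ep in entry_points(group="swh.workers"))
```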
-
- Dec 06, 2023
-
-
Antoine Lambert authored
Upload to PyPI fails when the README does not render as 'text/x-rst', so remove the Sphinx directives from it and turn its references into external links.
-
- Dec 05, 2023
-
-
David Douard authored
-
- Dec 04, 2023
-
-
David Douard authored
-
David Douard authored
For some reason the update of this file has not been applied in the recent revisions. Also slightly simplify the codespell setup by moving the actual configuration to pyproject.toml.
-
- Dec 03, 2023
-
-
David Douard authored
-
- Nov 29, 2023
-
-
David Douard authored
-
- Nov 17, 2023
-
-
Nicolas Dandrimont authored
This reimplements dir_filter in terms of path_filter to keep the backwards-compatibility with other users of the swh-loader-core API.
-
- Nov 14, 2023
-
-
Antoine Lambert authored
-
- Nov 13, 2023
-
-
Antoine R. Dumont authored
That fails the current loading ingestion as this is expected to be an exact value. Refs. #4746
-
- Oct 04, 2023
-
-
Antoine Lambert authored
Before computing the nar hash in the fetch_data method, detect whether the fetched artifact comes from a VCS (git, hg or svn) by checking the visit type of the loader, and set the vcs_type parameter of the Nar constructor accordingly. Related to swh-loader-git#4751.
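One way to derive the VCS type from a visit type is sketched below. It assumes visit types are strings prefixed with the VCS name, such as "git-checkout"; that naming and the function name are assumptions, not taken from the commit.

```python
from typing import Optional

VCS_TYPES = ("git", "hg", "svn")


def vcs_type_for(visit_type: str) -> Optional[str]:
    """Return the VCS name if the visit type starts with a known VCS prefix."""
    prefix = visit_type.split("-", 1)[0]
    return prefix if prefix in VCS_TYPES else None
```

The result, possibly None for non-VCS artifacts, would then be passed as the vcs_type argument mentioned above.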
-
Antoine Lambert authored
When computing the recursive nar hash of a directory fetched from a VCS like git or svn, only special directories related to the used VCS (.git or .svn) should be excluded. For instance when a directory was fetched from git, only .git folders should be excluded. Related to swh-loader-git#4751.
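The exclusion rule can be illustrated with a small directory walk. This is only a sketch of the filtering, not of the NAR serialization itself; the function name is hypothetical and the ".hg" entry is an assumption by analogy with ".git" and ".svn".

```python
import os
from typing import List, Optional

# Special directory to skip for each VCS type.
VCS_DIRS = {"git": ".git", "hg": ".hg", "svn": ".svn"}


def files_to_hash(root: str, vcs_type: Optional[str] = None) -> List[str]:
    """List files under root, skipping only the directory of the given VCS."""
    excluded = VCS_DIRS.get(vcs_type) if vcs_type else None
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into the excluded directory.
        dirnames[:] = sorted(d for d in dirnames if d != excluded)
        for name in sorted(filenames):
            paths.append(os.path.relpath(os.path.join(dirpath, name), root))
    return paths
```

With vcs_type set to "git", a stray .svn directory inside the checkout is still part of the hashed tree, which is the behavior the commit describes.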
-
- Sep 28, 2023
- Sep 25, 2023
-
-
Antoine Lambert authored
Previously, package versions were sorted according to the keys of the packages dict, but this is not reliable as older versions can be sorted after newer ones. Sort package versions according to their build time instead, as it produces a correct ordering and ensures the HEAD branch alias targets the most recent version of a package.
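The ordering problem is easy to reproduce with plain string keys. In the sketch below the "build_time" field name, the version strings and the timestamps are made up for illustration.

```python
from typing import Dict, List


def versions_by_build_time(packages: Dict[str, dict]) -> List[str]:
    """Order versions from oldest to newest build."""
    return sorted(packages, key=lambda version: packages[version]["build_time"])


# Hypothetical data: 1.10.0 was built after 1.9.0.
packages = {
    "1.9.0": {"build_time": 1_690_000_000},
    "1.10.0": {"build_time": 1_695_000_000},
}

by_key = sorted(packages)  # lexicographic: "1.10.0" sorts before "1.9.0"
by_time = versions_by_build_time(packages)
```

Taking the last element of `by_time` gives the most recent build, whereas the last element of `by_key` would wrongly be 1.9.0.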
-
- Sep 19, 2023
-
-
Raphaël Gomès authored
The initial implementation of the discovery algorithm was incorrectly done in this package. It has now been refactored into the appropriate places, so we can just use those.
-
- Sep 18, 2023
-
-
Antoine Lambert authored
Return 1 as exit code when the "swh loader run" command fails, i.e. when the visit status associated with the performed loading is different from "full". This makes it possible to check whether a loading failed in scripts calling that command.
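The rule stated above boils down to a one-line mapping from visit status to exit code. The function name is hypothetical; a CLI entry point would typically pass its result to sys.exit.

```python
def exit_code_for(visit_status: str) -> int:
    """Return 0 only when the visit ended with status "full", 1 otherwise."""
    return 0 if visit_status == "full" else 1
```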
-