- Feb 05, 2024
-
-
There are cases where the sha256 checksum for a source package is missing, typically for legacy Debian releases. So ensure the DebianPackageInfo class does not return a null extid by falling back to the sha1 or md5sum checksum instead.
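The fallback described above could be sketched as follows. The function name `pick_extid_checksum`, the preference tuple, and the shape of the `checksums` mapping are assumptions made for illustration; this is not the actual DebianPackageInfo code.

```python
from typing import Mapping, Optional, Tuple

# Assumed preference order: strongest checksum first, weaker ones as fallbacks.
CHECKSUM_PREFERENCE = ("sha256", "sha1", "md5sum")


def pick_extid_checksum(
    checksums: Mapping[str, Optional[str]],
) -> Optional[Tuple[str, str]]:
    """Return (algorithm, hex digest) for the first checksum that is present."""
    for algo in CHECKSUM_PREFERENCE:
        value = checksums.get(algo)
        if value:
            return algo, value
    # No usable checksum at all: the caller has to handle this case.
    return None
```

Returning the algorithm name along with the digest keeps it possible to tell which kind of checksum ended up in the extid.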
-
There might be cases where multiple versions of a package target the same release object. For instance, an rpm package has one specific version for each distribution release, but those versions can target the same intrinsic version with exactly the same source package contents. So avoid downloading and processing a package version if the corresponding extid has already been encountered during the current loading, by maintaining a mapping between extids and release SWHIDs.
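A minimal sketch of such a mapping is shown below. The `load_versions` function and the `process` callback (standing in for the costly download-and-ingest step) are hypothetical names, and reusing the recorded SWHID for the skipped versions is an assumption about how the result is consumed.

```python
from typing import Callable, Dict, Iterable, Tuple


def load_versions(
    versions: Iterable[Tuple[str, str]],
    process: Callable[[str], str],
) -> Dict[str, str]:
    """Map each version to a release SWHID, processing each extid only once.

    `versions` yields (version, extid) pairs; `process` downloads and ingests
    one version and returns the SWHID of the release it produced.
    """
    extid_to_release: Dict[str, str] = {}
    version_to_release: Dict[str, str] = {}
    for version, extid in versions:
        if extid not in extid_to_release:
            # First time this extid is seen during the current loading.
            extid_to_release[extid] = process(version)
        # Already seen extids reuse the release recorded earlier.
        version_to_release[version] = extid_to_release[extid]
    return version_to_release
```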
- Feb 02, 2024
-
-
This approach also avoids checking the mountpoint permissions or the file owner, which currently results in nar hash mismatches on (nixguix) origins with executables (for various loaders, e.g. tarball, git-checkout, ...) in our production setup. Refs. swh/infra/sysadm-environment#5230
-
Nicolas Dandrimont authored
The referenced requests_mock bug has been fixed upstream
-
Nicolas Dandrimont authored
-
- Jan 26, 2024
-
-
Antoine Lambert authored
Retrying 5 times with an exponential backoff could put those loaders to sleep for more than 20 minutes, so reduce the number of retries to 3 to make the loaders sleep for only a couple of seconds instead.
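To see why the attempt count matters so much, the cumulated sleep of an exponential backoff can be computed as in the sketch below. The base of 10 and the first wait of 1 second are assumptions chosen for illustration, not values taken from the loaders' retry configuration, and real elapsed time also includes the duration of each request.

```python
def total_backoff_seconds(
    attempts: int, base: float = 10.0, first_wait: float = 1.0
) -> float:
    """Sum of the sleeps between `attempts` tries of an exponential backoff.

    The k-th sleep (starting at k = 0) lasts first_wait * base**k seconds.
    """
    # N attempts imply N - 1 sleeps between them.
    sleeps = max(attempts - 1, 0)
    return sum(first_wait * base**k for k in range(sleeps))


# With the assumed values: 3 attempts sleep 11 seconds in total,
# while 5 attempts sleep 1111 seconds (about 18.5 minutes).
print(total_backoff_seconds(3), total_backoff_seconds(5))
```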
-
Antoine Lambert authored
-
- Jan 17, 2024
-
-
David Douard authored
Where the usage of async got dropped from the discovery protocol.
-
- Jan 16, 2024
-
-
Antoine Lambert authored
The requests package sets the Accept-Encoding HTTP header to "gzip, deflate" by default, and some servers (https://download.ocamlcore.org/ for instance) then send a compressed version of the artifact to download, usually with the Content-Encoding response header set to gzip. This conflicts with the code deciding whether the response bytes should be decompressed: they should not be when the Content-Encoding header is equal to gzip and Content-Type is equal to application/*gzip. As artifacts to download are usually tarballs already compressed with gzip, set the Accept-Encoding request header to identity in order to force the server to send the raw artifact bytes without the Content-Encoding header set.
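The header override described above can be sketched as follows. The helper names `identity_headers` and `download_chunks`, the chunk size, and the timeout are assumptions; only the `Accept-Encoding: identity` header comes from the commit message.

```python
from typing import Dict, Iterator, Optional


def identity_headers(extra: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Request headers asking the server not to apply any content coding."""
    headers = dict(extra or {})
    headers["Accept-Encoding"] = "identity"
    return headers


def download_chunks(url: str, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Stream an artifact exactly as stored on the server."""
    # Third-party dependency, imported lazily so the helper above
    # can be used and tested without it.
    import requests

    with requests.get(
        url, headers=identity_headers(), stream=True, timeout=60
    ) as response:
        response.raise_for_status()
        yield from response.iter_content(chunk_size=chunk_size)
```

A server is still free to ignore the request header, so a checksum check after the download remains useful.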
-
Antoine Lambert authored
It seems a better name, as shorter is better.
-
- Jan 15, 2024
-
-
Antoine Lambert authored
The NodeLoader class handles two checksum layouts: standard, where the checksum is computed from the raw downloaded artifact bytes, and nar, where the checksum is a NAR hash recursively computed from the source code tree. Previously, only nar checksums were stored as ExtIDs, while standard ones were only used for integrity checks after downloads. Since mapping the standard checksum of a tarball (resp. file) to its corresponding directory (resp. content) SWHID lets Guix check whether SWH has archived a software artifact, also save standard checksums as ExtIDs of type "checksum-<hash_algo>".
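As an illustration of the "checksum-<hash_algo>" naming, the sketch below builds such a record from raw artifact bytes. `ChecksumExtID` and `standard_checksum_extid` are hypothetical stand-ins using only the standard library; they are not the ExtID model class used by the loader.

```python
import hashlib
from typing import NamedTuple


class ChecksumExtID(NamedTuple):
    """Hypothetical, simplified stand-in for an ExtID record."""

    extid_type: str
    extid: bytes
    target: str


def standard_checksum_extid(
    data: bytes, hash_algo: str, target_swhid: str
) -> ChecksumExtID:
    """Build an ExtID-like record from the raw bytes of a downloaded artifact."""
    digest = hashlib.new(hash_algo, data).digest()
    return ChecksumExtID(f"checksum-{hash_algo}", digest, target_swhid)
```

Looking up an artifact by checksum then amounts to querying for the pair (extid_type, extid) and reading the target SWHID.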
-
- Dec 07, 2023
-
-
Antoine Lambert authored
-
Antoine Lambert authored
Commit 424540f1 broke the CLI launch of VCS loaders, as those did not get their swh.workers entry points renamed. So revert it and restore the previous swh.workers entry point names instead.
-
Those located in swh/loader/tests were no longer collected.
-
Entry points related to loaders have been renamed from "loader.<type>" to "swh.loader.<type>", so update the code gathering all loader types in a list accordingly.
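Gathering loader types from renamed entry points could look like the sketch below. The function names are hypothetical; the "swh.workers" group and the "swh.loader." prefix are taken from the commit messages in this log. Note that `entry_points(group=...)` requires Python 3.10 or later.

```python
from importlib.metadata import entry_points
from typing import Iterable, List

PREFIX = "swh.loader."


def loader_types(entry_point_names: Iterable[str]) -> List[str]:
    """Extract '<type>' from every 'swh.loader.<type>' entry point name."""
    return sorted(
        name[len(PREFIX):]
        for name in entry_point_names
        if name.startswith(PREFIX)
    )


def installed_loader_types(group: str = "swh.workers") -> List[str]:
    """List the loader types declared by the packages currently installed."""
    return loader_types(ep.name for ep in entry_points(group=group))
```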
-
- Dec 06, 2023
-
-
Antoine Lambert authored
Upload to PyPI fails when the README fails to render as 'text/x-rst', so remove the Sphinx directives and turn the references in it into external links.
-
- Dec 05, 2023
-
-
David Douard authored
-
- Dec 04, 2023
-
-
David Douard authored
-
David Douard authored
For some reason the update of this file has not been applied in the recent revisions. Also simplify the codespell configuration a bit by moving the actual configuration into pyproject.toml.
-
- Dec 03, 2023
-
-
David Douard authored
-
- Nov 29, 2023
-
-
David Douard authored
-
- Nov 17, 2023
-
-
Nicolas Dandrimont authored
This reimplements dir_filter in terms of path_filter to preserve backwards compatibility with other users of the swh-loader-core API.
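One way to express a directory-only filter through a more general path filter is an adapter like the one below. The signatures are assumptions made for illustration (both callables receive a path, a name, and the directory entries, with `entries` set to None for anything that is not a directory); the actual API may differ.

```python
from typing import Callable, Iterable, Optional

# Assumed signatures, not taken from the real API.
PathFilter = Callable[[bytes, bytes, Optional[Iterable[bytes]]], bool]
DirFilter = Callable[[bytes, bytes, Iterable[bytes]], bool]


def dir_filter_as_path_filter(dir_filter: DirFilter) -> PathFilter:
    """Wrap a legacy directory filter so it can be used as a path filter."""

    def path_filter(
        path: bytes, name: bytes, entries: Optional[Iterable[bytes]]
    ) -> bool:
        if entries is None:
            # Not a directory: the legacy filter never applied here, keep it.
            return True
        return dir_filter(path, name, entries)

    return path_filter
```

With such an adapter, existing callers can keep passing their directory filter while the traversal code only has to deal with path filters.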
-
- Nov 14, 2023
-
-
Antoine Lambert authored
-
- Nov 13, 2023
-
-
Antoine R. Dumont authored
That fails the current loading ingestion as this is expected to be an exact value. Refs. #4746
-
- Oct 04, 2023
-
-
Antoine Lambert authored
Before computing the nar hash in the fetch_data method, detect whether the fetched artifact comes from a VCS (git, hg or svn) by checking the visit type of the loader, and set the vcs_type parameter of the Nar constructor accordingly. Related to swh-loader-git#4751.
-
Antoine Lambert authored
When computing the recursive nar hash of a directory fetched from a VCS like git or svn, only special directories related to the used VCS (.git or .svn) should be excluded. For instance when a directory was fetched from git, only .git folders should be excluded. Related to swh-loader-git#4751.
-
- Sep 28, 2023
- Sep 25, 2023
-
-
Antoine Lambert authored
Previously, package versions were sorted according to the packages dict keys, but this is not reliable as older versions can be sorted after newer ones. Sort package versions according to their build time instead, as it produces a correct ordering and ensures the HEAD branch alias targets the most recent version of a package.
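The difference between the two orderings shows up in a small example. The `build_time` field name and the shape of the `packages` dict are assumptions made for illustration.

```python
from datetime import datetime, timezone
from typing import Dict, List


def versions_by_build_time(packages: Dict[str, dict]) -> List[str]:
    """Return versions from oldest to newest build; the last one is for HEAD."""
    return sorted(packages, key=lambda version: packages[version]["build_time"])


packages = {
    "1.9-1": {"build_time": datetime(2023, 1, 10, tzinfo=timezone.utc)},
    "1.10-1": {"build_time": datetime(2023, 6, 1, tzinfo=timezone.utc)},
}

# Sorting the keys as strings puts "1.10-1" before "1.9-1",
# although 1.10 is the newer version.
print(sorted(packages))
print(versions_by_build_time(packages))
```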
-
- Sep 19, 2023
-
-
Raphaël Gomès authored
The initial implementation of the discovery algorithm was incorrectly done in this package; it has now been refactored into the appropriate places, so we can just use those.
-
- Sep 18, 2023
-
-
Antoine Lambert authored
Return 1 as exit code when the "swh loader run" command fails, i.e. when the visit status associated with the performed loading is different from "full". This makes it possible to check whether a loading failed in scripts calling that command.
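The rule stated above boils down to a one-line mapping, sketched here with a hypothetical helper name; it is not the actual CLI code.

```python
def exit_code_for(visit_status: str) -> int:
    """Map the visit status of a loading to a process exit code."""
    # Anything other than a complete visit is reported as a failure.
    return 0 if visit_status == "full" else 1


# A command-line entry point would typically end with something like
# sys.exit(exit_code_for(status)), where `status` is the visit status
# reported by the loader.
```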
-
Antoine Lambert authored
It can be useful in tests or to run the directory loader on a local tarball.
-
Antoine Lambert authored
Case sensitivity is important for that representation as it can lead to invalid decodings otherwise.
-
Antoine Lambert authored
Previously, only top-level VCS directories were excluded, but that behavior does not match the one of the "guix hash -x -S nar" command, which recursively excludes those directories when computing a nar hash. So apply the same behavior to avoid hash mismatch issues when using a directory loader. Related to swh-loader-git#4751.
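Recursive exclusion can be illustrated with a directory walk that prunes the VCS directory at every depth. This sketch only covers the traversal, not the nar serialization or hashing, and the `VCS_DIRS` mapping and function name are assumptions.

```python
import os
from typing import Iterator

# Assumed mapping from VCS type to the directory name to exclude.
VCS_DIRS = {"git": ".git", "hg": ".hg", "svn": ".svn"}


def walk_without_vcs_dirs(root: str, vcs_type: str) -> Iterator[str]:
    """Yield file paths relative to `root`, skipping the VCS directory at any depth."""
    excluded = VCS_DIRS[vcs_type]
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into excluded directories,
        # whether they sit at the top level or inside a nested checkout.
        dirnames[:] = sorted(d for d in dirnames if d != excluded)
        for filename in sorted(filenames):
            yield os.path.relpath(os.path.join(dirpath, filename), root)
```

Only the directory matching the given VCS type is pruned, so a `.svn` folder inside a git checkout is still visited.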
-
vlorentz authored
-
- Sep 14, 2023
-
- Aug 28, 2023
-
-
Antoine Lambert authored
Do not attempt to parse versions with the packaging.version module, as the rpm version format does not match the Python package one and numerous parsing attempts were therefore failing. Use the package intrinsic version in the message of each produced release so that the loader does not create different releases targeting the same directory. Adapt the input data sent by the RPM lister to its latest changes, notably in tests. Related to swh/meta#5011.
-
- Aug 08, 2023
-
-
Antoine R. Dumont authored
The current implementation uses the default filtering which accepts all folders. It's going to be needed at least for the git loader which has to ignore the root .git folder and empty ones. Refs. swh/meta#3781
-
- Jul 25, 2023
-
-
Antoine Lambert authored
The Python requests library automatically decompresses downloaded content bytes if the Content-Encoding response header is set to a supported encoding. However, some HTTP servers serve a tarball with Content-Type set to application/x-gzip and Content-Encoding set to gzip, which is wrong because the tarball then gets decompressed while it is downloaded. That behavior can make the file checksum check after download fail, as the expected checksum was computed on the compressed version of the file, not the uncompressed one. So prevent automatic decompression by reading the raw response content instead of using the iter_content method when the Content-Type and Content-Encoding headers are both set to gzip format.
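The decision described above can be sketched as a header check followed by a raw read. The function names and the accepted media types are assumptions; `response` is expected to be a `requests.Response` obtained with `stream=True`, whose `raw` attribute is the underlying urllib3 stream.

```python
from typing import Iterator, Mapping

GZIP_MEDIA_TYPES = ("application/gzip", "application/x-gzip")


def served_gzip_as_is(headers: Mapping[str, str]) -> bool:
    """True when both headers claim gzip, i.e. the body must not be decoded."""
    media_type = headers.get("Content-Type", "").split(";")[0].strip().lower()
    encoding = headers.get("Content-Encoding", "").strip().lower()
    return encoding == "gzip" and media_type in GZIP_MEDIA_TYPES


def iter_body(response, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield body chunks, bypassing automatic decompression when needed."""
    if served_gzip_as_is(response.headers):
        while True:
            # decode_content=False returns the bytes exactly as sent.
            chunk = response.raw.read(chunk_size, decode_content=False)
            if not chunk:
                break
            yield chunk
    else:
        yield from response.iter_content(chunk_size=chunk_size)
```

Reading the raw stream keeps the bytes identical to the file the expected checksum was computed on.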
-
- Jul 10, 2023
-
-
Antoine Lambert authored
The get_enclosed_fields method does not return fields with no value so we need to use the dictionary get method to avoid raising KeyError and fix the loading of packages with missing metadata.
-