- Aug 28, 2023
-
-
Antoine Lambert authored
Do not attempt to parse versions with the packaging.version module: the RPM version format does not match the Python package one, so numerous versions failed to parse. Use the package's intrinsic version in the message of each produced release so that the loader does not create different releases targeting the same directory. Adapt to the latest changes in the input data sent by the RPM lister, notably in tests. Related to swh/meta#5011.
-
- Aug 08, 2023
-
-
Antoine R. Dumont authored
The current implementation uses the default filtering which accepts all folders. It's going to be needed at least for the git loader which has to ignore the root .git folder and empty ones. Refs. swh/meta#3781
-
- Jul 25, 2023
-
-
Antoine Lambert authored
The Python requests library automatically decompresses downloaded content bytes if the response header content-encoding is set to a supported encoding. However, some HTTP servers serve a tarball with content-type set to application/x-gzip and content-encoding set to gzip, which is wrong, as the tarball then gets decompressed while being downloaded. That behavior can make the file checksum check after download fail, as the expected checksum was computed on the compressed version of the file, not the uncompressed one. So prevent automatic decompression by reading the raw response content instead of using the iter_content method when the content-type and content-encoding headers are both set to gzip format.
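A minimal sketch of that check, assuming a requests-style streamed response (an object exposing `headers`, `raw.stream()` and `iter_content()`). The helper names are hypothetical and not taken from the loader:

```python
def gzip_type_and_encoding(headers) -> bool:
    """Return True when content-type and content-encoding both announce gzip."""
    content_type = headers.get("content-type", "").lower()
    content_encoding = headers.get("content-encoding", "").lower()
    return content_type == "application/x-gzip" and content_encoding == "gzip"


def iter_body(response, chunk_size: int = 64 * 1024):
    """Yield body chunks, bypassing automatic decompression when needed.

    `response` is assumed to come from requests.get(url, stream=True).
    """
    if gzip_type_and_encoding(response.headers):
        # urllib3 raw stream: decode_content=False keeps the bytes as sent,
        # so a checksum computed on the compressed tarball still matches.
        yield from response.raw.stream(chunk_size, decode_content=False)
    else:
        yield from response.iter_content(chunk_size=chunk_size)
```

With a real requests response the header lookup is case-insensitive; with a plain dict, as in this sketch, keys are expected in lowercase.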
-
- Jul 10, 2023
-
-
Antoine Lambert authored
The get_enclosed_fields method does not return fields with no value, so we need to use the dictionary get method to avoid raising KeyError and to fix the loading of packages with missing metadata.
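The difference can be illustrated with a plain dictionary. The field names below are made up for the example and do not come from the loader:

```python
# Hypothetical parsed metadata: fields without a value are simply absent.
fields = {"Name": "example", "Version": "1.0"}


def field_or_none(metadata: dict, name: str):
    """Look up a metadata field, returning None when it is missing."""
    return metadata.get(name)


# fields["Author"] would raise KeyError; get() returns None instead.
author = field_or_none(fields, "Author")
```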
-
- Jul 04, 2023
-
-
Antoine Lambert authored
Instead of calling opam once for each package metadata field to extract, get all these fields using a single opam call.
-
Antoine Lambert authored
Use subprocess.run instead of subprocess.call and subprocess.Popen and check opam command return codes to catch any possible issues.
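A minimal sketch of the pattern, using the Python interpreter as a stand-in command so it runs without opam installed. `run_checked` is a hypothetical helper name:

```python
import subprocess
import sys


def run_checked(cmd: list[str]) -> str:
    """Run a command and return its stdout, raising on a non-zero exit."""
    completed = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return completed.stdout


# Stand-in for an opam invocation.
output = run_checked([sys.executable, "-c", "print('ok')"])
```

With `check=True`, a failing command raises `subprocess.CalledProcessError`, which carries the return code, instead of the failure going unnoticed.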
-
Antoine Lambert authored
When the loader is used outside of a production context (in Docker for instance), the opam root folder is usually not present, so make sure to initialize it or update its repositories; otherwise the loading of packages will fail.
-
Antoine Lambert authored
opam 2.1 stores package metadata in a tarball instead of keeping its content unpacked as previous opam versions did, so the workaround of walking the inner directories of the opam root to get all versions of a package no longer works with opam 2.1. So use the proper way to get all versions of a package by calling the following command:

$ opam show --root <opam_root> <package> --field all-versions

The previous issue reported in the TODO comment was due to new opam repositories not being added to the default opam switch. This is now fixed, and package metadata from all repositories can be displayed using the opam show command.
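A sketch of how the command quoted in this message could be assembled and its output split into versions. Both helper names are hypothetical, and the whitespace-separated output format is an assumption:

```python
def all_versions_command(opam_root: str, package: str) -> list[str]:
    """Build the `opam show` command as an argument list."""
    return ["opam", "show", "--root", opam_root, package, "--field", "all-versions"]


def parse_all_versions(output: str) -> list[str]:
    """Split the command output into versions (assumed whitespace-separated)."""
    return output.split()
```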
-
Antoine Lambert authored
Opam 2.1 removed some CLI options and our test data is in opam 2.0 format, so we need to force its upgrade to the 2.1 format.
-
- Jun 21, 2023
-
-
Nicolas Dandrimont authored
-
Nicolas Dandrimont authored
The default behavior of subprocess is to pull executables from a hardcoded list, which doesn't work when opam is installed manually in the user's home directory.
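One possible way to honor the user's own PATH is to resolve the executable explicitly before running it. This is only a sketch: the message does not say how the loader addresses the problem, and `resolve_executable` is a hypothetical helper:

```python
import shutil


def resolve_executable(name: str) -> str:
    """Return the absolute path of `name` as found on the current PATH."""
    path = shutil.which(name)
    if path is None:
        raise FileNotFoundError(f"{name!r} not found in PATH")
    return path
```

The resolved path can then be used as the first element of the argument list passed to subprocess.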
-
Nicolas Dandrimont authored
-
- Jun 08, 2023
-
-
Antoine R. Dumont authored
As it is already running in production, the 'tar' visit type is now immutable, so we cannot change anything related to it. Instead, rename it back to what it was before and adapt the check utility functions so that those checks can be bypassed in some specific cases (like the ArchiveLoader). New loaders should fix their assertion failures immediately, though. Refs. swh/infra/sysadm-environment#4906
-
Co-Authored: Antoine Lambert <antoine.lambert@inria.fr>
-
Antoine R. Dumont authored
This: - Adds a generic test on all tasks modules of the package to ensure tasks and lister visit types are ok so scheduling can happen. - Exposes a utility function so it can be used in other loader modules. - Fixes the archive loader, which was misnamed, and adapts its task and its loader's visit types accordingly. This now needs a new deployment and a migration script on the scheduler db for that loader. Refs. swh/infra/sysadm-environment#4906
-
Antoine R. Dumont authored
If there is a discrepancy, tasks will not be scheduled when asked to. This test now specifically catches such errors. Equivalent tests should be declared for all "tasks" modules in the various swh packages. This detected the current discrepancy for the TarballDirectoryLoader. Refs. swh/infra/sysadm-environment#4906
-
- Jun 06, 2023
-
-
That should allow centralized configuration for loaders without convolutions on the sysadm side. It generically adds options that may not be understood by all loaders, without making them fail. Loaders supporting those options will use them; the others will simply ignore them. This is an alternative to loosening the base loader constructor to accept kwargs, which could have unwanted side effects. Refs. swh/infra/ci-cd/swh-charts!62
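A sketch of one way options unknown to a loader could be dropped before calling its constructor. The message does not describe the actual mechanism, and the helper and both loader classes below are hypothetical:

```python
import inspect


def supported_kwargs(cls, options: dict) -> dict:
    """Keep only the options that cls.__init__ declares as parameters."""
    accepted = inspect.signature(cls.__init__).parameters
    return {key: value for key, value in options.items() if key in accepted}


class PlainLoader:  # hypothetical loader that knows no extra option
    def __init__(self, url: str) -> None:
        self.url = url


class TunableLoader:  # hypothetical loader that understands `max_size`
    def __init__(self, url: str, max_size: int = 0) -> None:
        self.url = url
        self.max_size = max_size


# The same shared configuration is handed to both loaders.
shared = {"url": "https://example.org/src", "max_size": 1024}
plain = PlainLoader(**supported_kwargs(PlainLoader, shared))
tunable = TunableLoader(**supported_kwargs(TunableLoader, shared))
```

Compared with accepting `**kwargs` in the base constructor, the constructors keep strict signatures, so a misspelled option passed directly to a loader still raises.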
-
- Jun 01, 2023
-
-
Jérémy Bobbio (Lunar) authored
The recent refactoring done in abe741a9, then used in swh-loader-svn@72dfc41, introduced build issues with the documentation:

…/swh/loader/svn/directory.py:docstring of swh.loader.svn.directory.SvnDirectoryLoader.snapshot:1: WARNING: more than one target found for cross-reference 'Snapshot': swh.fuse.fs.artifact.Snapshot, swh.model.model.Snapshot
…/swh/loader/svn/directory.py:docstring of swh.loader.svn.directory.SvnDirectoryLoader.cnts:1: WARNING: more than one target found for cross-reference 'Content': swh.fuse.fs.artifact.Content, swh.model.from_disk.Content, swh.model.model.Content
…/swh/loader/svn/directory.py:docstring of swh.loader.svn.directory.SvnDirectoryLoader.dirs:1: WARNING: more than one target found for cross-reference 'Directory': swh.fuse.fs.artifact.Directory, swh.model.from_disk.Directory, swh.model.model.Directory

vlorentz explained the cause:

> SvnDirectoryLoader inherits from BaseDirectoryLoader which inherits
> from NodeLoader, which defines:
>
>     self.snapshot: Optional[Snapshot] = None
>
> and it loses the annotation's value (only keeps its string
> representation) because of the inheritence:
> https://github.com/sphinx-doc/sphinx/issues/10124

In order to fix this, we now use a qualified type reference in the initializers of NodeLoader and BaseDirectoryLoader.
-
- May 31, 2023
-
-
Antoine R. Dumont authored
The default snapshot currently built is fine for the TarballDirectoryLoader. We may have to improve the snapshot for directories coming from a VCS (where they can be associated with a "tag" or something else as well). Refs. swh/meta#4979
-
Antoine R. Dumont authored
This deduplicates a function currently duplicated in the git and svn loader tests. Refs. swh/meta#4979
-
- May 30, 2023
-
-
Antoine R. Dumont authored
Refs. swh/meta#4979
-
Antoine R. Dumont authored
This moves the path mangling into the tarball directory loader. It is the root cause of an issue in the current implementation of the git directory loader. Refs. swh/meta#4979
-
- May 25, 2023
-
-
Antoine R. Dumont authored
Refs. swh/meta#4979
-
Antoine R. Dumont authored
BaseDirectoryLoader is the base class and TarballDirectoryLoader is one specific implementation for tarballs. Previously it was a single class. This makes the current use for tarball directories explicit. It will also allow declaring other directory loader implementations to deal with other origin types (VCS 'tree' such as git, svn, hg). For this, developers need to provide a new class implementation per directory type. This class needs to inherit from BaseDirectoryLoader and provide only the `fetch_directory` method. The BaseDirectoryLoader class is now an abstract class and TarballDirectoryLoader is the first implementation, in charge of ingesting a 'directory' coming from a tarball. Further MRs will be opened to deal with directories coming from Git, Hg or Svn in their respective loader packages. Refs. swh/meta#4979
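The split between an abstract base class and one concrete implementation can be sketched as follows. The `Sketch*` classes, the `load` method and the `fetch_directory` signature are illustrative assumptions, not the actual swh-loader-core API:

```python
import tarfile
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class SketchBaseDirectoryLoader(ABC):
    """Hypothetical base class: subclasses only say how to fetch a directory."""

    def load(self) -> list[str]:
        """Fetch the directory and list the files it contains."""
        with tempfile.TemporaryDirectory() as tmp:
            directory = self.fetch_directory(Path(tmp))
            return sorted(
                path.relative_to(directory).as_posix()
                for path in directory.rglob("*")
                if path.is_file()
            )

    @abstractmethod
    def fetch_directory(self, workdir: Path) -> Path:
        """Materialize the directory under `workdir` and return its path."""


class SketchTarballDirectoryLoader(SketchBaseDirectoryLoader):
    """Hypothetical implementation for a directory shipped as a local tarball."""

    def __init__(self, tarball: Path) -> None:
        self.tarball = tarball

    def fetch_directory(self, workdir: Path) -> Path:
        target = workdir / "extracted"
        target.mkdir()
        with tarfile.open(self.tarball) as archive:
            archive.extractall(target)
        return target
```

A loader for another origin type would subclass the base class in the same way and only override `fetch_directory`.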
-
- May 15, 2023
-
-
vlorentz authored
"Changed in version 3.11: The population must be a sequence. Automatic conversion of sets to lists is no longer supported." -- https://docs.python.org/3/library/random.html#random.sample
-
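The Python 3.11 change to random.sample quoted above means a set has to be converted to a sequence before sampling. A minimal illustration with made-up data:

```python
import random

population = {"a", "b", "c", "d"}

# On Python 3.11+, random.sample(population, 2) raises TypeError for a set;
# converting to a sequence first works on every version.
picked = random.sample(sorted(population), 2)
```

Using `sorted()` rather than `list()` also makes the result reproducible for a given seed, since set iteration order is not guaranteed.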
- May 05, 2023
- Apr 26, 2023
-
-
Antoine R. Dumont authored
NodeLoader as in {Content|Directory}Loader. Those can directly ingest, respectively, a (remote) file (as Content) or a (remote) tarball artifact (as Directory). They are currently listed by the nixguix lister. Depending on the checksum_layout (standard, nar), they will compute checksums differently. With the "standard" checksum layout, they compute the usual "swh" checksums (sha1, sha1_git, ...) and validate them. With the "nar" checksum layout, they compute the "nar" checksums and store those (once validated) as ExtID. This keeps the node loader constructor backward compatible with the previous version: it still treats `checksums_computation` as `checksum_layout` and falls back to the "standard" layout when nothing is provided. Refs. swh/meta#4979
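A sketch of what such a backward-compatible constructor could look like. `SketchNodeLoader` is a hypothetical class, and the precedence between the two parameter names is an assumption:

```python
from typing import Optional

ALLOWED_LAYOUTS = ("standard", "nar")


class SketchNodeLoader:
    """Hypothetical loader accepting both the new and the old parameter name."""

    def __init__(
        self,
        checksum_layout: Optional[str] = None,
        checksums_computation: Optional[str] = None,
    ) -> None:
        # Assumed precedence: new name first, then old name, then the default.
        layout = checksum_layout or checksums_computation or "standard"
        if layout not in ALLOWED_LAYOUTS:
            raise ValueError(f"unsupported checksum layout: {layout!r}")
        self.checksum_layout = layout
```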
-
- Apr 25, 2023
-
-
Antoine R. Dumont authored
After validating the nar integrity checksums provided by the lister, this now also stores those as ExtID entries (as many as there are checksum hashes provided at initialization time). This behavior is shared by both Content and Directory loaders. It only occurs when the `checksums_computation` constructor parameter is set to "nar". Refs. swh/meta#4979
-
Antoine R. Dumont authored
The implementation no longer requires such a binary. Refs. swh/meta#4979
-
- Apr 13, 2023
-
-
Antoine Lambert authored
The http_retry decorator from swh-lister has been moved to swh-core, so we can now use it in swh-loader-core instead of duplicating retry code. Moreover, it also allows retrying HTTP requests on errors like 502, 503 or 504 instead of only retrying on 429.
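The idea of retrying on a set of status codes can be sketched with a small standalone decorator. This is not the http_retry implementation from swh-core: the exception class, decorator name and parameters are hypothetical:

```python
import functools
import time

RETRIED_STATUS_CODES = frozenset({429, 502, 503, 504})


class HTTPStatusError(Exception):
    """Hypothetical error carrying the HTTP status code of a failed request."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def retry_on_status(attempts: int = 3, delay: float = 1.0):
    """Retry the wrapped call when it fails with one of the retried codes."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except HTTPStatusError as exc:
                    last_try = attempt == attempts
                    if exc.status_code not in RETRIED_STATUS_CODES or last_try:
                        raise
                    time.sleep(delay)

        return wrapper

    return decorator
```

Errors with other status codes, such as 404, are re-raised immediately since retrying them is unlikely to help.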
-
- Apr 06, 2023
-
-
Antoine Lambert authored
Previously, any package loader that produced a snapshot with dangling branches for an origin failed to load that origin again, as a None check was missing when processing the last snapshot. Fix #3449.
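A simplified illustration of such a None check, using a plain dictionary as a stand-in for the last snapshot's branches. The data and variable names are made up:

```python
# Hypothetical snapshot: branch name -> target, None marking a dangling branch.
last_snapshot_branches = {
    b"releases/1.0": "rel-1.0",
    b"releases/2.0": None,
}

known_targets = [
    target
    for target in last_snapshot_branches.values()
    if target is not None  # skip dangling branches instead of failing on them
]
```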
-
- Apr 05, 2023
-
-
Antoine Lambert authored
Some packages can have an invalid version string in their changelog, which raises a ValueError when attempting to parse it in the get_intrinsic_package_metadata method of the loader. As a consequence, each package release with such a bogus entry in the changelog was discarded from the snapshot created by the loader. So prefer to get the raw version string instead of parsing it to work around that issue. Fix #1493.
-
Antoine R. Dumont authored
This pushes options from the CLI to the Nar class. This also bootstraps tests reusing existing test cases from before (using the nix binary). Refs. swh/meta#4979
-
Antoine R. Dumont authored
This also exposes a `swh nar` cli.

```
$ Usage: swh nar [OPTIONS] DIRECTORY

  Compute NAR hashes on a directory.

Options:
  -x, --exclude-vcs               exclude version control directories
  -H, --hash-algo [sha256|sha1]
  -f, --format-output [hex|base32|base64]
  --debug / --no-debug
  -h, --help                      Show this message and exit.
```

Refs. swh/meta#4979
-
- Apr 04, 2023
-
-
Kumar Shivendu authored
-
-
- Mar 27, 2023
-
-
Jérémy Bobbio (Lunar) authored
After switching to the PyData Sphinx theme, the very large table with the loader specification became too wide to be readable. These changes make the table scrollable again, remove the right sidebar, highlight the name column and add stripes to rows. They are a stopgap measure, as such a table might not be the best way to present this information.
-
- Mar 08, 2023
-
-
Antoine Lambert authored
The version number of a pubdev package containing a dash character fails to be parsed by packaging.parse_version. So split the version number on the dash to extract its parsable part and fix the loading of such packages. Resolves #4741
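A minimal sketch of the splitting step, assuming the parsable part is whatever precedes the first dash. `parsable_version_part` is a hypothetical helper name:

```python
def parsable_version_part(version: str) -> str:
    """Return the part of a version string that precedes the first dash."""
    return version.split("-", 1)[0]
```

The returned string is what would then be handed to the version parser; a version without a dash is returned unchanged.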
-
- Feb 23, 2023
-
-
Jérémy Bobbio (Lunar) authored
GitLab displays the content of the README file when browsing the repository. But if the file is a symlink, it displays the path the symlink points to. There is a 6-year-old issue about this: https://gitlab.com/gitlab-org/gitlab/-/issues/15093 We can work around the issue by having the content at the root of the repository and a symlink to this file in the `docs/` directory. Tested in swh/devel/swh-py-template!27
-