- Jan 24, 2024
-
-
Nicolas Dandrimont authored
This hooks into the right urllib3 and requests settings for both the smart and dumb loader.
-
Nicolas Dandrimont authored
This sets the connect and read timeout for both the smart loader (via urllib3/dulwich) and for the dumb loader (via requests).
-
Nicolas Dandrimont authored
This is useful to override the default settings of the requests Session, e.g. certificate verification or connect/read timeouts.
-
Nicolas Dandrimont authored
This is useful to override the default settings of the dulwich urllib3 adapter, e.g. certificate verification or connect/read timeouts.
-
Nicolas Dandrimont authored
-
Nicolas Dandrimont authored
Git loading tasks can take a pretty long time, and it's not easy to diagnose if it's stuck or if it's just taking a while. Instead of only logging at the end of the task, print a log line after each object type has been fully processed. Also print a log line every 3 minutes while objects are being processed.
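The time-based progress logging described above can be sketched roughly as follows. This is a minimal illustration and not the loader's actual code: the `process_with_progress` helper, its parameters, and the injectable `clock` are hypothetical names chosen for the example.

```python
import logging
import time
from typing import Callable, Iterable

logger = logging.getLogger(__name__)


def process_with_progress(
    objects: Iterable[object],
    object_type: str,
    interval: float = 180.0,  # 3 minutes, as in the commit message
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Consume objects, logging progress at most once per interval.

    Hypothetical helper: a real loader would store each object instead
    of merely counting it.
    """
    count = 0
    last_log = clock()
    for _ in objects:
        count += 1
        now = clock()
        if now - last_log >= interval:
            logger.info("Processed %d %s objects so far", count, object_type)
            last_log = now
    # Always log once the object type has been fully processed.
    logger.info("Done processing %d %s objects", count, object_type)
    return count
```

Passing the clock as a parameter is only a convenience for testing the sketch without waiting; `time.monotonic` is used by default so that wall-clock adjustments do not affect the interval.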
-
Nicolas Dandrimont authored
The packfile fetching operation can take a long time. Send one log line every minute while it progresses.
-
Nicolas Dandrimont authored
Instead of dumping the dulwich remote communication stream to stderr, add a separate logger for remote messages, and handle the remote stream as proper log entries.
-
Nicolas Dandrimont authored
Newer versions of git create a ".rev" file next to the existing ".pack" and ".idx", making the nb_files inconsistent.
-
- Jan 16, 2024
-
-
Antoine Lambert authored
A utility function was renamed in swh-loader-core.
-
Antoine Lambert authored
If the submodules parameter of the loader is True but no .gitmodules file is found in the root directory of the repository, the repository path is not yielded and thus its loading is discarded.
-
- Jan 08, 2024
-
-
Antoine Lambert authored
It indicates if submodules should be retrieved after the git checkout operation as some guix origins require it. Related to #4751.
-
- Dec 05, 2023
-
-
David Douard authored
-
- Dec 04, 2023
-
-
David Douard authored
-
- Dec 03, 2023
-
-
David Douard authored
-
- Nov 27, 2023
-
-
Jérémy Bobbio (Lunar) authored
-
- Nov 20, 2023
-
-
David Douard authored
Make it valid for pypi.
-
David Douard authored
Convert README from markdown to ReST to make it embeddable in docs/index.rst
-
- Nov 17, 2023
-
-
David Douard authored
Seems for some reason we do not want to install the package as editable any more now...
-
David Douard authored
-
David Douard authored
This later version changed the API of the directory filtering mechanism in BaseDirectoryLoader (paths are now expected to be bytes).
-
- Oct 09, 2023
-
-
Antoine Lambert authored
It fixes some cases where the tag of interest was not fetched. Related to #4751
-
- Oct 05, 2023
-
-
Antoine Lambert authored
Make sure to remove the trailing slash from the git URL when computing its basename, as an empty string is returned otherwise. When a shallow fetch fails, typically when the ref is a commit short hash, retry with a full fetch in order for the ref checkout to succeed. Related to #4751.
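The basename issue can be reproduced with the standard library alone. The sketch below is illustrative only: `repo_basename` is a hypothetical helper name, and the URL is a placeholder.

```python
import os.path


def repo_basename(url: str) -> str:
    """Return the last path component of a git URL, ignoring trailing slashes."""
    return os.path.basename(url.rstrip("/"))


# Without stripping, a trailing slash yields an empty basename.
naive = os.path.basename("https://example.org/user/repo/")
fixed = repo_basename("https://example.org/user/repo/")
```

Here `naive` is the empty string while `fixed` is `"repo"`.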
-
Antoine Lambert authored
It has been observed that the process used by SWH to check out a remote git reference can lead to different recursive NAR hash values compared to those computed by guix. This seems related to CR/LF normalization. So prefer to align the process to check out a remote git reference with the one used by guix. It also seems faster than the previous approach. Also refine the not found repository detection process, as previously some unrelated git errors could be missed. Related to #4751.
-
- Sep 18, 2023
-
-
Antoine Lambert authored
The git directory loader is used to archive guix source packages where source code is located in a git repository at a specific reference. To ensure SWH archives the exact same set of source code files for a guix package, the recursive NAR hash of the source code directory is computed and compared against the one computed by guix. Previously the loader always fetched git submodules if some were set for the git repository, but guix only fetches those for a couple of packages and not for all git based ones. This could result in a directory hash mismatch when the loader fetched the submodules when it should not have. In order to work around this, first compute the NAR hash without fetching submodules and, if this results in a directory hash mismatch, retry the operation with the submodules fetched. Related to #4751.
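The retry strategy described above can be sketched as follows. This is an assumption-laden outline, not the loader's implementation: `checkout_matching_hash`, `checkout` and `compute_nar_hash` are hypothetical names standing in for the real git and hashing steps.

```python
from typing import Callable


def checkout_matching_hash(
    checkout: Callable[[bool], str],
    compute_nar_hash: Callable[[str], str],
    expected_hash: str,
) -> str:
    """Return the path of a checkout whose NAR hash matches expected_hash.

    checkout(with_submodules) returns a directory path and
    compute_nar_hash(path) returns its hash; both are hypothetical
    callables supplied by the caller.
    """
    # Try without submodules first, then retry with them on mismatch.
    for with_submodules in (False, True):
        path = checkout(with_submodules)
        if compute_nar_hash(path) == expected_hash:
            return path
    raise ValueError("NAR hash mismatch with and without submodules")
```

Trying the cheaper checkout first means the submodules are only fetched for the packages that actually need them.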
-
- Aug 24, 2023
-
-
Antoine R. Dumont authored
Without this, some git clones fail to be ingested because they reference a submodule which is not initialized. This results in a hash mismatch since the git tree checked out does not match the upstream nix/guix manifest. Refs. swh/devel/swh-loader-git#4751
-
- Aug 22, 2023
-
-
Antoine R. Dumont authored
-
- Aug 21, 2023
-
-
Antoine R. Dumont authored
Inspired by the pip cloning step [1]. This makes the cloning step fetch only the commit information and the tree at the current heads. A subsequent switch (checkout) then retrieves the tree at the reference we want. In effect, this retrieves the tree needed to ingest the repository much faster. [1] pip uses a blobless clone, though.
-
- Aug 07, 2023
-
-
Antoine R. Dumont authored
Refs. swh/meta#3781
-
- Jul 03, 2023
-
-
Antoine Lambert authored
The previous commit modified the dumb.check_protocol function to raise an HTTPError exception when the request to check dumb protocol support failed. As the NotFound exception inherits from ValueError, the code for checking dumb protocol support was executed even when a repository was not found. So an HTTPError exception was raised with a 404 status code and the NotFound exception was no longer propagated to the base loader class, resulting in a failed visit status instead of a not_found one.
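The failure mode described above comes from exception inheritance and can be illustrated in isolation. The classes and functions below are simplified stand-ins written for this example (only the fact that NotFound inherits from ValueError comes from the commit message), and `load_fixed` shows one possible way to avoid the problem, not necessarily the fix that was applied.

```python
class NotFound(ValueError):
    """Stand-in for the loader's NotFound exception."""


class HTTPError(Exception):
    """Stand-in for the HTTP error raised by the dumb protocol check."""


def fetch() -> None:
    # Simulates a smart-protocol fetch on a missing repository.
    raise NotFound("repository not found")


def check_dumb_protocol() -> None:
    # Simulates the protocol check failing with a 404.
    raise HTTPError("404")


def load_buggy() -> None:
    try:
        fetch()
    except ValueError:
        # NotFound is a ValueError, so the fallback also runs for
        # missing repositories and its HTTPError masks NotFound.
        check_dumb_protocol()


def load_fixed() -> None:
    try:
        fetch()
    except NotFound:
        # Let NotFound propagate before the broader handler sees it.
        raise
    except ValueError:
        check_dumb_protocol()
```

Calling `load_buggy()` ends with an `HTTPError`, whereas `load_fixed()` lets `NotFound` reach the caller.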
-
- Jun 14, 2023
-
-
Antoine Lambert authored
Some network issues can also happen when checking whether a git repository can be cloned using the dumb protocol, so add an HTTP retry feature to the check_protocol function.
-
Antoine Lambert authored
-
- Jun 09, 2023
-
-
Antoine R. Dumont authored
Most CLIs use `-` as separator rather than `_`. `git_disk` is kept as is because it's old.
-
Antoine R. Dumont authored
This also fixes the inconsistent naming of the git checkout related loader and task. Refs. swh/infra/sysadm-environment#4906
-
Antoine Lambert authored
It enables using the loader through the following command: $ swh loader run git_checkout <url> ref=<ref> checksums=<checksums>
-
- Jun 07, 2023
-
-
Antoine R. Dumont authored
Version 0.21.4.1 is actually broken.
-
- Jun 06, 2023
-
-
Antoine R. Dumont authored
Refs. swh/meta#4979
-
- Jun 05, 2023
-
-
Antoine R. Dumont authored
Otherwise, we'd lose the context in the snapshot. Refs. swh/meta#4979
-
Antoine R. Dumont authored
This unifies with other swh imports.
-
Antoine R. Dumont authored
-