- Jun 23, 2023
-
-
Antoine Lambert authored
A missing docstring prevents the task type from being registered in the scheduler database.
-
Antoine Lambert authored
Pagure is a git-centered forge, written in Python and based on pygit2. Its REST API makes it easy to list all projects hosted in an instance, so the lister implementation is quite simple. Related to swh/meta#5043.
-
- Jun 21, 2023
-
-
Nicolas Dandrimont authored
The default behavior of subprocess is to pull executables from a hardcoded list, which doesn't work when opam is installed manually in the user's home directory.
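A minimal sketch of one way to sidestep this, assuming the executable is resolved against the caller's `PATH` before being handed to `subprocess`. `resolve_executable` is a hypothetical helper for illustration, not the lister's actual code.

```python
import os
import shutil
from typing import Optional


def resolve_executable(name: str, search_path: Optional[str] = None) -> str:
    """Return the absolute path of `name`.

    The lookup uses `search_path` when given, otherwise the PATH of the
    current environment (hypothetical helper, for illustration only).
    """
    if search_path is None:
        search_path = os.environ.get("PATH", os.defpath)
    found = shutil.which(name, path=search_path)
    if found is None:
        raise FileNotFoundError(f"{name} not found in {search_path}")
    return found
```

The resolved path can then be passed as the first element of the command, for example `subprocess.run([resolve_executable("opam"), "--version"], check=True)`.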
-
Nicolas Dandrimont authored
mypy doesn't catch that multiple uses of `self.listed_origins[origin_url]` in the same statement should be identical. Using a temporary local variable for it seems to help.
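The workaround can be sketched as follows. The `ListedOrigin` class and `bump_last_update` function are hypothetical stand-ins, assumed only to show the pattern of binding the looked-up value to a local variable once.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional


@dataclass
class ListedOrigin:
    """Hypothetical stand-in for the real listed origin model."""

    url: str
    last_update: Optional[datetime] = None


def bump_last_update(
    listed_origins: Dict[str, ListedOrigin], origin_url: str, candidate: datetime
) -> None:
    # Single lookup: every later use refers to the same local variable,
    # so the type checker only has one expression to reason about.
    origin = listed_origins[origin_url]
    if origin.last_update is None or candidate > origin.last_update:
        origin.last_update = candidate
```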
-
- Jun 20, 2023
-
-
vlorentz authored
The files we use weigh 440MB, and there are ~600MB of files we don't use
-
- Jun 08, 2023
-
-
Antoine R. Dumont authored
For the ones coming from a tarball. This matches the change made in the associated directory loader. Refs. swh/infra/sysadm-environment#4906
-
- Jun 07, 2023
-
-
Antoine R. Dumont authored
Without this, the loader will fail. Refs. swh/meta#4979
-
- Jun 05, 2023
-
-
Antoine R. Dumont authored
Prior to this, it was sending only 'directory' types for all vcs trees. Multiple directory loaders now exist whose visit types currently diverge, so the scheduling would not happen correctly without it. This commit is the required adaptation for the scheduling to work appropriately. Refs. swh/meta#4979
-
- May 31, 2023
-
-
Antoine R. Dumont authored
Those will be ingested by the loader as "directory" with "nar" checksum layouts. Refs. swh/infra/sysadm-environment#4868 Refs. swh/meta#4979
-
- May 23, 2023
-
-
Antoine R. Dumont authored
Some cgit instances are at a domain's root path, so we can build their url directly from their 'instance' parameter. This further unifies the cli to register a lister and the cli to schedule the listed origins from a forge.

[1]
```
https://git.kernel.org
https://source.codeaurora.org
https://git.trueelena.org
https://dev.sanctum.geek.nz
https://git.trueelena.org
https://git.dpkg.org
https://anongit.mindrot.org
https://git.aurel32.net
https://gitweb.gentoo.org
https://git.joeyh.name
https://git.adrian.geek.nz
```

Refs. #4693
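A minimal sketch of that fallback, assuming the lister accepts an optional `url` and an optional `instance` and that instances at a domain's root are served over https. `cgit_base_url` is a hypothetical name, not the lister's actual API.

```python
from typing import Optional


def cgit_base_url(url: Optional[str] = None, instance: Optional[str] = None) -> str:
    """Return the URL to list, falling back to https://<instance>.

    Hypothetical helper: an explicit url always wins; otherwise the url is
    built from the instance name (assumed to be a bare domain).
    """
    if url:
        return url
    if instance:
        return f"https://{instance}"
    raise ValueError("either 'url' or 'instance' must be provided")
```

With this, `cgit_base_url(instance="git.kernel.org")` gives `https://git.kernel.org`, so only the instance needs to be supplied on the command line.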
-
- May 19, 2023
-
-
Antoine R. Dumont authored
This moves the rather elementary logic into the lister's scope. This will simplify and unify cli calls between the lister and scheduler clis. It will also reduce the erroneous operations which can happen, for example, in add-forge-now. With the following, we will only have to provide the type and the instance, then everything will be scheduled properly. Refs. #4693
-
- May 10, 2023
-
-
vlorentz authored
```
$ swh lister run
Traceback (most recent call last):
  File "/home/dev/.local/bin/swh", line 33, in <module>
    sys.exit(load_entry_point('swh.core', 'console_scripts', 'swh')())
  File "/home/dev/swh-environment/swh-core/swh/core/cli/__init__.py", line 144, in main
    return swh(auto_envvar_prefix="SWH")
  File "/home/dev/.local/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/dev/.local/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/dev/.local/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/dev/.local/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/dev/.local/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/dev/.local/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/dev/.local/lib/python3.9/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/dev/swh-environment/swh-lister/swh/lister/cli.py", line 68, in run
    get_lister(lister, **config).run()
  File "/home/dev/swh-environment/swh-lister/swh/lister/__init__.py", line 75, in get_lister
    raise ValueError(
ValueError: Invalid lister None: only supported listers are ['arch', 'aur', 'bitbucket', 'bower', 'cgit', 'conda', 'cpan', 'cran', 'crates', 'debian', 'fedora', 'gitea', 'github', 'gitlab', 'gnu', 'gogs', 'golang', 'hackage', 'hex', 'launchpad', 'maven', 'nixguix', 'npm', 'nuget', 'opam', 'packagist', 'phabricator', 'pubdev', 'puppet', 'pypi', 'rubygems', 'sourceforge', 'tuleap']
```
-
- Apr 27, 2023
-
-
Antoine R. Dumont authored
Starting with the first url. As soon as one detection succeeds, this stops and yields the result. Otherwise, continue with the detection on the next mirror url. This should fix the current misbehavior [1] when multiple mirror urls are not ok but the first one is. [1] swh/infra/sysadm-environment#4868 (comment 137483) Refs. swh/infra/sysadm-environment#4868
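The described behavior can be sketched as a loop that stops at the first successful detection. `detect_from_mirrors` and the `detect` callable are hypothetical names, and treating any exception or `None` result as a failed detection is an assumption made for this illustration.

```python
from typing import Callable, Iterable, Optional, Tuple


def detect_from_mirrors(
    mirror_urls: Iterable[str],
    detect: Callable[[str], Optional[str]],
) -> Optional[Tuple[str, str]]:
    """Try each mirror url in order and return the first successful detection.

    Returns a (url, detected_value) pair, or None when every mirror fails.
    """
    for url in mirror_urls:
        try:
            result = detect(url)
        except Exception:
            # Assumption: a failing mirror is skipped, not fatal.
            continue
        if result is not None:
            # First success wins: later mirrors are never queried.
            return url, result
    return None
```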
-
- Apr 26, 2023
-
-
Antoine R. Dumont authored
Refs. swh/meta#4979
-
- Apr 13, 2023
-
-
Antoine Lambert authored
The http_retry decorator has been moved to the swh-core package to ease its reuse across swh packages.
-
- Mar 22, 2023
-
-
- Mar 21, 2023
-
-
Antoine Lambert authored
Instead of fully consuming the get_origins_from_page generator into a list and truncating it, consume the generator one origin at a time and stop when the maximum number of origins per page is reached. Some non-trivial listers, like the cgit one, can perform costly processing (an HTTP request, for instance) for each origin in a page, so it is better not to consume the full generator at once, to avoid such side effects.
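A minimal sketch of the lazy approach, using `itertools.islice` as one possible way to express it. `take_origins` and `max_origins_per_page` are illustrative names, not necessarily the ones used in the lister.

```python
from itertools import islice
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def take_origins(
    origins: Iterable[T], max_origins_per_page: Optional[int]
) -> Iterator[T]:
    """Lazily yield at most `max_origins_per_page` origins.

    The underlying generator is never advanced past the limit, so the
    per-origin work of the remaining items is not triggered. A limit of
    None yields everything.
    """
    return islice(origins, max_origins_per_page)
```

By contrast, `list(origins)[:max_origins_per_page]` would run the generator to exhaustion before truncating, paying the per-origin cost for every origin in the page.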
-
Antoine R. Dumont authored
This unifies it with the other lister task modules. It also allows the cgit task to be scheduled by the add-forge-now scheduler cli. Refs. swh/infra/sysadm-environment#4813
-
- Mar 14, 2023
-
-
- Mar 10, 2023
-
-
Antoine Lambert authored
Some URLs of the repositories endpoint of the BitBucket REST API 2.0 can return an error 500. In that case, skip the faulty repositories page and fetch the next one, so that listing continues instead of ending prematurely. Related to #4239
-
- Mar 09, 2023
-
-
Antoine Lambert authored
-
- Feb 17, 2023
-
-
Antoine Lambert authored
Related to swh/meta#4960
-
- Feb 16, 2023
-
-
Jérémy Bobbio (Lunar) authored
Related to swh/meta#4959
-
- Feb 10, 2023
-
- Feb 02, 2023
-
-
Antoine Lambert authored
This fixes Python 3.7 support, which broke because poetry, a dependency of isort, removed support for that Python version in a recent release.
-
- Jan 02, 2023
-
-
Antoine Lambert authored
requests_ratelimited fixture from swh-core was renamed to github_requests_ratelimited. remaining_requests parameter was added to the github_response_callback function from swh-core, making it no longer compatible with requests_mock callback for json responses.
-
Antoine Lambert authored
-
- Dec 19, 2022
-
-
Antoine Lambert authored
In order to remove warnings about /apidoc/*.rst files being included multiple times in toc when building full swh documentation, prefer to include module indices only when building standalone package documentation. Also include them the proper sphinx way. Related to T4496
-
- Dec 14, 2022
-
-
- Dec 05, 2022
-
-
Nicolas Dandrimont authored
Hopefully one day we'll be able to replace all of this mess with PEP692 TypedDict kwargs, but that's only on track for Python 3.12.
-
Nicolas Dandrimont authored
Some GitLab instances use specific namespaces for transient repositories that it doesn't make sense to archive (for example, gitlab.org has a set of QA namespaces used for integration testing of their production deployments; drupal has an `issues/` namespace with forks of repos that are only used for collaboration on merge requests, and aren't that useful to archive).
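A minimal sketch of such filtering, assuming projects are identified by their `namespace/name` path and that ignored namespaces are given as path prefixes. `filter_ignored_namespaces` and its parameters are hypothetical names for illustration.

```python
from typing import Iterable, Iterator


def filter_ignored_namespaces(
    project_paths: Iterable[str], ignored_prefixes: Iterable[str]
) -> Iterator[str]:
    """Yield only the project paths that are outside the ignored namespaces.

    Prefixes are expected to end with "/" (e.g. "issues/") so that a
    namespace named "issues-tracker" is not dropped by accident.
    """
    prefixes = tuple(ignored_prefixes)
    for path in project_paths:
        # str.startswith accepts a tuple and returns False for an empty one.
        if path.startswith(prefixes):
            continue
        yield path
```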
-
Nicolas Dandrimont authored
This cuts down one more manual step in the add forge now validation process: we can add the relevant origins to the staging scheduler without enabling them at all.
-
Nicolas Dandrimont authored
This will allow more automation of the staging add forge now process: for known-good listers, we can limit the number of origins being processed and reduce the amount of manual steps taken for each instance.
-
Nicolas Dandrimont authored
The SQL dump contains ownership instructions that can't be run if you don't have the right users in your database clusters. When someone has a psqlrc with ON_ERROR_STOP, this fails the load of the dump. Use the opportunity to trigger an exception when psql returns a non-zero exit code, rather than continue with an empty/inconsistent database.
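A minimal sketch of both ideas, assuming the dump is loaded by invoking `psql` through `subprocess`. Skipping the user's psqlrc with `--no-psqlrc` is one possible way to avoid an inherited `ON_ERROR_STOP` and is an assumption here, as are the helper names `psql_load_command` and `run_checked`.

```python
import subprocess
from typing import List, Sequence


def psql_load_command(dump_path: str, dbname: str) -> List[str]:
    """Build a psql invocation that loads `dump_path` into `dbname`.

    --no-psqlrc keeps per-user settings out of the run (assumption: this is
    how the psqlrc interference is avoided).
    """
    return ["psql", "--no-psqlrc", "--dbname", dbname, "--file", dump_path]


def run_checked(cmd: Sequence[str]) -> None:
    """Run `cmd` and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(cmd, check=True)
```

Calling `run_checked(psql_load_command("dump.sql", "mydb"))` would then fail loudly if psql exits with an error, instead of leaving an empty or inconsistent database behind; the file and database names are placeholders.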
-
- Nov 21, 2022
-
-
Antoine Lambert authored
In a similar way to the debian lister, use the following versions in the packages dictionary provided to the generic rpm loader:

- dict keys are package versions prefixed by the fedora release and edition in which they were found (fedora{release}/{edition}/{version}); they will be used as branch names targeting releases in the snapshot created by the rpm loader
- version fields in dict values are the package intrinsic versions parsed from package repository metadata, excluding any ".fcXY" suffixes, to prevent the loader from creating multiple releases targeting the same directory; they will be used as release names in the snapshot created by the rpm loader

Related to T4448
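The two version forms can be sketched as follows. The helper names and the regular expression used to drop the ".fcXY" suffix are assumptions made for illustration; the lister's actual parsing may differ.

```python
import re

# Assumption: the Fedora dist tag looks like ".fc" followed by digits.
_FC_SUFFIX = re.compile(r"\.fc\d+")


def branch_key(release: int, edition: str, version: str) -> str:
    """Dict key, later used as a branch name in the snapshot."""
    return f"fedora{release}/{edition}/{version}"


def intrinsic_version(version: str) -> str:
    """Package version without the ".fcXY" suffix, used as release name."""
    return _FC_SUFFIX.sub("", version)


# Example with made-up values:
key = branch_key(36, "Everything", "1.2.3-4.fc36")
# key == "fedora36/Everything/1.2.3-4.fc36"
name = intrinsic_version("1.2.3-4.fc36")
# name == "1.2.3-4"
```

Because the suffix is stripped, the same package version found in two Fedora releases maps to a single release name while still getting one branch per release and edition.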
-