- Sep 21, 2021
-
-
Antoine R. Dumont authored
This matches how it's done for all other multi-instance listers. Related to T3590
-
Antoine R. Dumont authored
We should avoid side effects in the constructor as much as possible, since they cause surprising behavior at object instantiation time. Any state that is needed must be initialized in the `swh.lister.pattern.Lister.get_pages` method, as recommended in the class docstring. This also fixes the current test, which actually bootstraps a real local opam "clone" in /tmp. Related to T3590
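A minimal sketch of the idea, assuming a simplified stand-in class: `SketchLister`, `_bootstrap` and the example URL are hypothetical and are not the real `swh.lister.pattern.Lister` API. The constructor only stores parameters, and the expensive setup runs when listing starts.

```python
from typing import Iterator, List


class SketchLister:
    """Hypothetical lister used for illustration only."""

    def __init__(self, url: str) -> None:
        # Only store parameters here: no filesystem or network access.
        self.url = url
        self.initialized = False

    def _bootstrap(self) -> None:
        # Stand-in for expensive setup (e.g. preparing a local opam root).
        self.initialized = True

    def get_pages(self) -> Iterator[List[str]]:
        # Side effects are deferred until the listing is actually iterated.
        if not self.initialized:
            self._bootstrap()
        yield [f"{self.url}/package-a", f"{self.url}/package-b"]
```

With this shape, a test can instantiate the lister without touching /tmp, and only pays for the setup when it iterates over `get_pages()`.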
-
- Sep 17, 2021
-
-
Antoine R. Dumont authored
This will allow the heptapod instances to have their own dedicated stats. Related to T3581
-
Antoine R. Dumont authored
Related to T3581#70593
-
- Sep 16, 2021
-
-
Antoine R. Dumont authored
This will allow listing the foss.heptapod.net instance, for example. Related to T3581
-
- Jul 23, 2021
-
-
Antoine Lambert authored
The GitLab API can return 500 errors when listing projects (see https://gitlab.com/gitlab-org/gitlab/-/issues/262629). To avoid ending the listing prematurely, skip buggy URLs and move on to the next pages. Related to T3442
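The skip-and-continue behavior can be sketched as follows. This is an illustration under assumptions, not the lister's code: `ServerError`, `list_pages` and the injected `fetch` callable are hypothetical names, and pages are addressed by a plain page number.

```python
import logging
from typing import Callable, Iterator, List

logger = logging.getLogger(__name__)


class ServerError(Exception):
    """Hypothetical stand-in for an HTTP 500 response."""


def list_pages(
    fetch: Callable[[int], List[str]], last_page: int
) -> Iterator[List[str]]:
    """Yield every page that can be fetched, skipping the failing ones."""
    for page in range(1, last_page + 1):
        try:
            yield fetch(page)
        except ServerError:
            # Log the buggy page and move on instead of ending the listing.
            logger.warning("Skipping page %s after a server error", page)
```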
-
Antoine Lambert authored
Increase the number of origins per page to the maximum value allowed by the GitLab API (100) in order to send fewer requests. Ask for simple responses to reduce the size of the JSON data.
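As an illustration, the two settings map to the `per_page` and `simple` query parameters of the GitLab projects endpoint. The helper name, the example host and the use of a plain `page` parameter are assumptions made for this sketch; the lister may paginate differently.

```python
from urllib.parse import urlencode


def projects_url(api_base: str, page: int = 1) -> str:
    """Build a projects listing URL (hypothetical helper)."""
    params = {
        "page": page,
        "per_page": 100,   # maximum page size allowed by the GitLab API
        "simple": "true",  # ask for the reduced JSON representation
    }
    return f"{api_base.rstrip('/')}/projects?{urlencode(params)}"
```

For example, `projects_url("https://gitlab.example.org/api/v4")` builds a URL ending in `/projects?page=1&per_page=100&simple=true`.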
-
Antoine Lambert authored
Temporary server failures can happen when listing a GitLab instance; HTTP status codes 502, 503 or 520 are returned in that case. So adapt the lister's request retry policy to execute requests again when such errors are encountered. Related to T3442
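A rough sketch of such a retry policy, written with the standard library only rather than with the retry library the lister relies on. `request_with_retry`, the injected `send` callable and the doubling delay are assumptions for illustration.

```python
import time
from typing import Callable, Tuple

# Status codes treated as temporary server failures.
RETRY_STATUS_CODES = {502, 503, 520}


def request_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() again while it returns a retryable status code."""
    delay = 1.0
    status, body = send()
    attempts = 1
    while status in RETRY_STATUS_CODES and attempts < max_attempts:
        sleep(delay)  # injectable, so tests can stub it out
        delay *= 2    # assumed backoff: double the wait each time
        status, body = send()
        attempts += 1
    return status, body
```

Passing `sleep` as a parameter keeps tests fast: a test can hand in `list.append` and inspect the recorded delays instead of actually waiting.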
-
- Jul 20, 2021
-
-
Antoine R. Dumont authored
This aligns the behavior with the opam loader. Related to T3358
-
- Jul 13, 2021
-
-
Antoine Lambert authored
Make the instance parameter of the base pattern lister optional, and set the lister name to the URL network location when it is not provided. This simplifies lister creation when the associated forge type has a lot of instances in the wild (e.g. gitlab or cgit), while giving more details about the listed forge instance. Also update the listers for forges with multiple instances (cgit, gitea, gitlab, phabricator and tuleap) to ensure the URL network location is used when the instance parameter is not provided. Related to T3403
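The fallback described above can be sketched with `urllib.parse.urlparse`, whose `netloc` attribute holds the network location of a URL. The function name is hypothetical and does not come from the lister code.

```python
from typing import Optional
from urllib.parse import urlparse


def lister_instance_name(url: str, instance: Optional[str] = None) -> str:
    """Return the explicit instance name, or the URL network location."""
    if instance is not None:
        return instance
    return urlparse(url).netloc
```

For example, `lister_instance_name("https://foss.heptapod.net/api/v4/")` gives `"foss.heptapod.net"`, while an explicit `instance="heptapod"` is returned unchanged.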
-
- Jul 09, 2021
-
-
Antoine R. Dumont authored
This rewrites the current implementation to actually use PyPI's XML-RPC API, which allows the listing to be incremental. It also allows fetching the last release date per package. This last part actually makes it possible to update the "last_update" entry in the ListedOrigin model. Related to T3399
-
Antoine R. Dumont authored
-
- Jul 06, 2021
-
-
zapashcanon authored
-
- Jun 09, 2021
-
-
Antoine Lambert authored
-
David Douard authored
-
- Jun 04, 2021
-
-
Raphaël Gomès authored
See inline comment as to why. This change also adds a Mercurial repo to the test data.
-
- Jun 03, 2021
-
-
Raphaël Gomès authored
I previously forgot to add the `https://` prefix to the cloning URL. Whoops.
-
- May 31, 2021
-
-
Antoine R. Dumont authored
This is a temporary workaround until we have made a first pass on those repositories. Related to T3350
-
- May 28, 2021
-
- May 26, 2021
-
-
Raphaël Gomès authored
Since this lister makes a lot more requests than most others, it makes sense that issues would arise more often. We want the lister to continue even if the website is having issues, and not break on the first 500 or closed connection it encounters. This change introduces a mechanism to retry all exceptions worth retrying and uses it for the SourceForge lister. Other listers might benefit from this, but this is out of scope here. Tests had to be adjusted to stub the sleep function, since retries happened way more often.
-
Boris Baldassari authored
tuleap-lister: fix args in test_task.
tuleap-lister: Add rate-limiting test + fix debug and typo.
tuleap-lister: code review: fix mocker + tests/setup_cli.
tuleap-lister: code review: fix relister > lister.
tuleap-lister: code review: fix test_task kwargs.
tuleap-lister: code review: Remove authentication useless lines + fix typos.
tuleap-lister: code review: improve results_simplified for svn repos.
tuleap-lister: code review: add name to CONTRIBUTORS file.
tuleap-lister: code review: Update tutorial for misc files to edit.
tuleap-lister: code review: Update copyright to 2021 exactly.
tuleap-lister: code review: Update py files perms -X.
tuleap-lister: code review: minimise json files.
tuleap-lister: code review: fix chmod on json files.
tuleap-lister: code review: fix var names + add tests.
tuleap-lister: code review: fix useless indirection.
tuleap-lister: code review: Add empty repo test, minor typo fixes.
-
- May 12, 2021
-
-
Raphaël Gomès authored
It's suboptimal, to say the least, to stop the entire lister process if a single project page is somehow broken (404, most likely). This change logs the issue as a warning and carries on; it also includes some minor logging changes and comment touch-ups.
-
- May 07, 2021
-
-
Antoine R. Dumont authored
Related to T3310
-
Antoine R. Dumont authored
The credentials parameter is not optional due to the instance constructor logic. Even if unused, this must be provided to the lister (from the task standpoint). Related to T3310#64801
-
Antoine Lambert authored
This ensures the mocked sleep will work with all tenacity versions. Related to T3310
-
Antoine Lambert authored
It fixes the Debian package build of swh-lister on buster.
-
- May 06, 2021
-
-
Raphaël Gomès authored
SourceForge's sitemaps (1 main one + many sharded) give us a "last modified" date for every subsitemap and project, allowing us to perform an incremental listing. We store the subsitemaps' "last modified" dates in the lister state, as well as those of the empty projects (projects which don't have any VCS registered), and the rest comes from the already visited origins from the database. The tests try to cover the possible cases of a subsitemap that has changed, one that hasn't, a project that has changed, one that hasn't, and the same for an empty project.
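The core of the incremental check can be sketched as a comparison between the "last modified" dates seen now and those stored from the previous run. `changed_entries` and the dictionary layout are hypothetical simplifications, not the lister's actual state model.

```python
from datetime import date
from typing import Dict, List


def changed_entries(current: Dict[str, date], state: Dict[str, date]) -> List[str]:
    """Return the URLs that are new or modified since the stored state."""
    return [
        url
        for url, last_modified in current.items()
        # Unknown URLs are new; known ones are kept only if they are newer.
        if url not in state or last_modified > state[url]
    ]
```

Entries whose date is unchanged are skipped entirely, which is what makes the listing incremental.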
-
- Apr 28, 2021
-
-
Antoine Lambert authored
Make it possible to check that the package documentation can be built without producing Sphinx warnings. The sphinx environment is designed to be used in continuous integration, in order to prevent breaking the documentation build when committing changes. The sphinx-dev environment is designed to be used inside a full swh development environment. Related to T3258
-
- Apr 27, 2021
-
-
vlorentz authored
Bitbucket's API kind of supports REST workflows, but they clearly use it like an RPC API (the hardcoded schema in `PROJECT_API_URL_FORMAT` makes it particularly clear).
-
- Apr 13, 2021
- Apr 04, 2021
-
-
Hezekiah Maina authored
-
Hezekiah Maina authored
-
- Mar 23, 2021
-
-
Raphaël Gomès authored
Following zack's work on T735, this change introduces an actual SWH lister for SourceForge. SourceForge provides a main sitemap that lists sharded sitemaps, which themselves list pages. Each page belongs to a project (or sub-project, though those are rare), information about which can be found by querying a REST API, which gives us the list of any and all VCS used for said project. Both sitemaps and pages have a "last modified" timestamp that will be used in a future patch to implement incremental listing. More precise information can be found as inline comments or docstrings.
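Reading a main sitemap that lists sharded sitemaps might look like the sketch below, which uses the standard sitemap protocol namespace and `xml.etree.ElementTree`. The function name is hypothetical, and the real lister may parse its sitemaps differently.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Namespace defined by the sitemap protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_sitemap_index(xml_text: str) -> List[Tuple[str, str]]:
    """Return (location, last modified) pairs for each listed sub-sitemap."""
    root = ET.fromstring(xml_text)
    entries = []
    for node in root.findall("sm:sitemap", NS):
        # Missing elements fall back to an empty string.
        loc = node.findtext("sm:loc", default="", namespaces=NS)
        lastmod = node.findtext("sm:lastmod", default="", namespaces=NS)
        entries.append((loc.strip(), lastmod.strip()))
    return entries
```

The "last modified" value is returned as raw text here; turning it into a date is left to the caller.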
-
- Mar 19, 2021
-
-
Nicolas Dandrimont authored
These errors happen, sometimes, when requesting large pages of results.
-
Nicolas Dandrimont authored
This makes the logic easier to test.
-
Nicolas Dandrimont authored
These happen, sometimes, when the connection to the GitHub server resets, e.g. because of congestion on a slow link.
-
Nicolas Dandrimont authored
-
Nicolas Dandrimont authored