- Jul 10, 2023
-
-
Antoine R. Dumont authored
Depending on the instance, specific heuristics apply. Some instances:
  - have summary pages which do not list metadata_url (so some computation is needed to list git:// origins, which are cloneable);
  - have summary pages which reference metadata_url as multiple comma-separated URLs;
  - list relative URLs for the repository, so these must be joined with the main instance URL to obtain complete cloneable origins (or summary pages);
  - list "down" http origins (cloning those won't work), so these are listed as cloneable https ones instead (when the main URL is behind https).
Refs. swh/devel/swh-lister#1800
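The heuristics above can be sketched as a small URL-normalization helper. This is only an illustration under stated assumptions, not the lister's actual code: the function name `candidate_origins` and the example hostnames are hypothetical, and the http-to-https rewrite is applied only when the instance URL itself uses https.

```python
from urllib.parse import urljoin, urlparse


def candidate_origins(instance_url: str, raw_value: str) -> list:
    """Turn a raw (possibly comma-separated, possibly relative) URL field
    into a list of absolute URLs. Hypothetical helper for illustration."""
    instance_scheme = urlparse(instance_url).scheme
    origins = []
    for part in raw_value.split(","):
        part = part.strip()
        if not part:
            continue
        # Relative URLs are resolved against the instance URL;
        # absolute URLs (http://, git://, ...) are returned unchanged.
        url = urljoin(instance_url, part)
        parsed = urlparse(url)
        # Assumption: an http origin listed by an https instance is
        # rewritten to its https counterpart.
        if parsed.scheme == "http" and instance_scheme == "https":
            url = parsed._replace(scheme="https").geturl()
        origins.append(url)
    return origins
```

For instance, calling it with `"https://git.example.org/cgit/"` and `"repo.git"` yields a single absolute URL under the instance path, while a comma-separated value yields one entry per URL.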
-
- Jun 23, 2023
-
-
Antoine Lambert authored
Pagure is a git-centered forge, Python-based and built on pygit2. Its REST API makes it easy to list all projects hosted on an instance, so the lister implementation is quite simple. Related to swh/meta#5043.
-
- Mar 14, 2023
-
-
- Nov 15, 2022
-
-
Kumar Shivendu authored
Summary: Lister to ingest Fedora mirrors (.rpm)
Reviewers: #reviewers, vlorentz
Subscribers: vlorentz, olasd
Maniphest Tasks: T4448
Differential Revision: https://forge.softwareheritage.org/D8386
-
- Oct 03, 2022
-
-
Antoine R. Dumont authored
Related to T3781
-
- Sep 29, 2022
-
-
- Sep 27, 2022
-
-
Franck Bret authored
Related T1718
-
Franck Bret authored
The puppet lister retrieves origins from https://forge.puppet.com/modules Related T4519
-
Franck Bret authored
Related T2833
-
Franck Bret authored
Use the HTTP API endpoint to get package names and build origin URLs.
-
Franck Bret authored
Related T4547
-
- Aug 30, 2022
-
-
Raphaël Gomès authored
This uses https://index.golang.org. An associated loader will be sent in the near future, as well as an incremental version of this lister. [1] https://go.dev/ref/mod#goproxy-protocol
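As a rough illustration of what consuming the index involves: the index serves newline-delimited JSON records with `Path`, `Version` and `Timestamp` fields. The sketch below only parses such a response body; the function names are hypothetical and no network access is performed.

```python
import json


def parse_index_page(body: str) -> list:
    """Parse one newline-delimited JSON page of the Go module index."""
    entries = []
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        entries.append((record["Path"], record["Version"], record["Timestamp"]))
    return entries


def unique_module_paths(entries: list) -> list:
    """Return module paths in first-seen order, without duplicates."""
    seen = set()
    paths = []
    for path, _version, _timestamp in entries:
        if path not in seen:
            seen.add(path)
            paths.append(path)
    return paths
```

The timestamp of the last entry on a page is the natural cursor for requesting the next page, which is also what an incremental version would need to persist.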
-
- Aug 29, 2022
-
-
Franck Bret authored
-
- Aug 26, 2022
-
-
Franck Bret authored
Stateless lister for https://pub.dev, based on its HTTP API, to list package names
-
- Aug 19, 2022
-
-
Franck Bret authored
Add 'aur' module to swh-lister with data fixtures and tests. For now, origin URLs are the package VCS (Git) URLs.
-
- Aug 03, 2022
-
-
Kumar Shivendu authored
-
- Jun 15, 2022
-
-
Franck Bret authored
After a first attempt with D7812, this one uses a different strategy to retrieve origins. Fetch and extract "core.files.tar.gz", "extra.files.tar.gz" and "community.files.tar.gz" from archives.archlinux.org. That step ensures that we have a list of "official" packages. Parse metadata from the 'desc' file to build origin URLs. Scrape the origin URL to get artifact metadata listing all versions of a package. It also fetches and extracts unofficial 'arm' packages from archlinuxarm.org, but in this case we cannot get all versions of an arm package. Related T4233
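The 'desc' parsing step can be sketched as follows. A pacman 'desc' file is a sequence of `%FIELD%` header lines, each followed by one or more value lines. The function name `parse_desc` is hypothetical and this is a minimal sketch, not the lister's implementation.

```python
def parse_desc(text: str) -> dict:
    """Parse the content of a pacman 'desc' file into {field: [values]}."""
    fields = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("%") and line.endswith("%") and len(line) > 2:
            # A header line such as %NAME% starts a new field.
            current = line[1:-1]
            fields[current] = []
        elif line and current is not None:
            # Non-empty lines belong to the most recent header.
            fields[current].append(line)
    return fields
```

Fields such as `NAME`, `VERSION` and `ARCH` are what one would combine to build an origin URL; the exact URL pattern is left out here because the commit message does not state it.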
-
- Mar 28, 2022
-
-
Franck Bret authored
The Crates lister retrieves crate packages for the Rust language. It basically clones https://github.com/rust-lang/crates.io-index.git to a temp directory and then walks through each file to get the crate's info.
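The walk over the index checkout can be sketched like this. Each file in the index holds one JSON object per line, one per published version, with `name` and `vers` keys among others. The function name is hypothetical, and the clone step is assumed to have happened already.

```python
import json
from pathlib import Path


def iter_crate_versions(index_dir):
    """Yield (crate name, version) pairs from a local crates.io-index checkout."""
    root = Path(index_dir)
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        # Skip directories, git internals and the index configuration file.
        if not path.is_file() or ".git" in relative.parts or path.name == "config.json":
            continue
        for line in path.read_text(encoding="utf-8").splitlines():
            if line.strip():
                record = json.loads(line)
                yield record["name"], record["vers"]
```

Sorting the paths is not required; it only makes the output order deterministic.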
-
- Nov 29, 2021
-
-
Boris Baldassari authored
The Maven lister retrieves the Maven Central indexes, exports them in a convenient text format, and parses them to identify all src archives and pom files in the Maven repository. Then the pom files are downloaded and analysed to find and yield any scm reference. Note: This is a new version of the maven lister diff D6133 which takes into account the initial round of reviews. Related to T1724
-
- Jul 06, 2021
-
-
zapashcanon authored
-
- May 26, 2021
-
-
Boris Baldassari authored
tuleap-lister: fix args in test_task.
tuleap-lister: Add rate-limiting test + fix debug and typo.
tuleap-lister: code review: fix mocker + tests/setup_cli.
tuleap-lister: code review: fix relister > lister.
tuleap-lister: code review: fix test_task kwargs.
tuleap-lister: code review: Remove authentication useless lines + fix typos.
tuleap-lister: code review: improve results_simplified for svn repos.
tuleap-lister: code review: add name to CONTRIBUTORS file.
tuleap-lister: code review: Update tutorial for misc files to edit.
tuleap-lister: code review: Update copyright to 2021 exactly.
tuleap-lister: code review: Update py files perms -X.
tuleap-lister: code review: minimise json files.
tuleap-lister: code review: fix chmod on json files.
tuleap-lister: code review: fix var names + add tests.
tuleap-lister: code review: fix useless indirection.
tuleap-lister: code review: Add empty repo test, minor typo fixes.
-
- Mar 23, 2021
-
-
Raphaël Gomès authored
Following zack's work on T735, this change introduces an actual SWH lister for SourceForge. SourceForge provides a main sitemap that lists sharded sitemaps, which themselves list pages. Each page belongs to a project (or sub-project, though those are rare), information about which can be found by querying a REST API, which gives us the list of any and all VCS used for said project. Both sitemaps and pages have a "last modified" timestamp that will be used in a future patch to implement incremental listing. More precise information can be found as inline comments or docstrings.
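The sitemap traversal described above can be sketched with the standard library. The XML namespace is the one defined by the sitemap protocol; the function name `parse_sitemap` is hypothetical, and fetching the documents and querying the REST API are left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_sitemap(xml_text: str) -> list:
    """Return (location, last modified) pairs from a sitemap index
    (<sitemap> entries) or from a sharded sitemap (<url> entries)."""
    root = ET.fromstring(xml_text)
    entries = []
    for node in root.findall("sm:sitemap", NS) + root.findall("sm:url", NS):
        loc = node.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = node.findtext("sm:lastmod", default=None, namespaces=NS)
        entries.append((loc, lastmod))
    return entries
```

The same function handles both levels: applied to the main sitemap it returns the sharded sitemaps, and applied to a shard it returns the pages. The "last modified" values are returned as-is so that a later incremental pass could compare them with a stored state.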
-
- Sep 23, 2020
-
-
David Douard authored
-
- Sep 17, 2020
-
-
Antoine Lambert authored
Related to T2610
-
- Aug 25, 2020
-
-
David Douard authored
-
- Jun 25, 2020
-
-
Nicolas Dandrimont authored
-
- Jun 10, 2020
-
-
Summary: Lister implementation for Gitea (T2313). For now, because of https://github.com/go-gitea/gitea/issues/9165, it requires setting its limit parameter to 50.
Reviewers: #reviewers, ardumont
Reviewed By: #reviewers, ardumont
Subscribers: ardumont
Differential Revision: https://forge.softwareheritage.org/D3107
-
- Apr 29, 2020
-
-
Stefano Zacchiroli authored
-
- Apr 20, 2020
-
-
Antoine R. Dumont authored
Related to T2367
-
- Apr 11, 2020
-
-
Léni Gauffier authored
Summary: Related to T1734. From abandoned D2799.
Reviewers: ardumont
Reviewed By: ardumont
Differential Revision: https://forge.softwareheritage.org/D2974
-
- Apr 08, 2020
-
-
David Douard authored
  - blackify all the python files,
  - enable black in pre-commit,
  - add a black tox environment.
-
- Nov 22, 2019
-
-
Nicolas Dandrimont authored
-
- Sep 20, 2019
-
-
Antoine R. Dumont authored
Prior to this commit, the pip activation environment failed because the old cli name no longer exists; it is named 'lister' now.
-
- Sep 03, 2019
-
-
David Douard authored
Listers are declared as plugins via the `swh.workers` entry_point. As such, the registry function is expected to return a dict with the `task_modules` field (as for generic worker plugins), plus:
  - `lister`: the lister class,
  - `models`: list of SQLAlchemy models used by this lister,
  - `init` (optional): hook (callable) used to initialize the lister's state (typically, create/initialize the database for this lister). If not set, the default implementation creates database tables (after optionally having deleted existing ones) according to the models declared in the `models` register field.
There is no need to explicitly add lister task modules in the main `conftest` module, but any new/extra lister to be tested must be registered (the tested lister module must be properly installed in the test environment).
Also refactor the cli tools a bit:
  - add support for the standard --config-file option at the 'lister' group level,
  - move the --db-url option to the 'lister' group,
  - drop the --lister option for the `swh lister db-init` cli tool: initializing (especially with --drop-tables) the database for a single lister is unreliable, since all tables are created using a single MetaData (in the same namespace).
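A registry function of the shape described above might look like the following sketch. The class names, the task module path and the `check_registry` helper are all hypothetical placeholders; only the dict keys (`task_modules`, `lister`, `models`, `init`) come from the description.

```python
class ExampleModel:
    """Placeholder standing in for a SQLAlchemy model (hypothetical)."""
    __tablename__ = "example_repo"


class ExampleLister:
    """Placeholder standing in for a lister class (hypothetical)."""
    LISTER_NAME = "example"


def register() -> dict:
    """Registry function referenced by the `swh.workers` entry point."""
    return {
        # Hypothetical module path, as for generic worker plugins.
        "task_modules": ["swh.lister.example.tasks"],
        "lister": ExampleLister,
        "models": [ExampleModel],
        # "init" is optional and deliberately omitted here.
    }


def check_registry(entry: dict) -> bool:
    """Illustrative check that a registry dict has the expected shape."""
    required = {"task_modules", "lister", "models"}
    if not required.issubset(entry):
        return False
    init = entry.get("init")
    return init is None or callable(init)
```

Leaving out `init` corresponds to the default behaviour, where tables are created from the models listed under `models`.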
-
- Jun 28, 2019
-
-
Antoine R. Dumont authored
-
- May 22, 2019
-
-
David Douard authored
Also add a cli group named 'lister' for the sake of consistency with other swh packages, and rename the command to 'db-init', as in: swh lister db-init LISTER [...]
-
- Feb 06, 2019
-
-
David Douard authored
-
- Oct 30, 2018
-
-
David Douard authored
-
- Oct 23, 2018
-
-
David Douard authored
-