- Jul 10, 2023
Antoine R. Dumont authored
Refs. swh/devel/swh-lister#1800
-
- Nov 15, 2022
Kumar Shivendu authored
Summary: Lister to ingest fedora mirrors (.rpm)
Reviewers: #reviewers, vlorentz
Subscribers: vlorentz, olasd
Maniphest Tasks: T4448
Differential Revision: https://forge.softwareheritage.org/D8386
-
- Oct 07, 2022
Antoine Lambert authored
Instead of using an undocumented rubygems HTTP endpoint that only gives us the names of the gems, prefer to exploit the daily PostgreSQL dump of the rubygems.org database. It enables listing all gems as well as all versions of a gem and their release artifacts. For each release artifact, the following info is extracted: version, download URL, sha256 checksum, release date, plus a couple of extra metadata. The lister will now set the list of artifacts and the list of metadata as extra loader arguments when sending a listed origin to the scheduler database. A last_update date is also computed, which should ensure loading tasks for rubygems are scheduled only when new releases are available since the last loading. Note that the lister spawns a temporary postgres instance, so this requires the initdb executable from a postgres server installation to be available in the execution environment. Related to T1777
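As a rough illustration of the last_update computation mentioned above, the sketch below takes the most recent release date across a gem's artifacts. The dictionary keys (`version`, `url`, `sha256`, `date`), the example URLs, and the function name `compute_last_update` are hypothetical and chosen for this example only; the actual lister may structure its data differently.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


def compute_last_update(artifacts: List[Dict[str, Any]]) -> Optional[datetime]:
    """Return the most recent release date among artifacts, or None if there is none."""
    # Only keep entries that actually carry a datetime under the (assumed) "date" key.
    dates = [a["date"] for a in artifacts if isinstance(a.get("date"), datetime)]
    return max(dates) if dates else None


# Hypothetical artifacts for a single gem (field names are assumptions).
artifacts = [
    {
        "version": "1.0.0",
        "url": "https://example.org/downloads/demo-1.0.0.gem",
        "sha256": "0" * 64,
        "date": datetime(2021, 3, 1, tzinfo=timezone.utc),
    },
    {
        "version": "1.1.0",
        "url": "https://example.org/downloads/demo-1.1.0.gem",
        "sha256": "1" * 64,
        "date": datetime(2022, 6, 15, tzinfo=timezone.utc),
    },
]

last_update = compute_last_update(artifacts)
```

Comparing such a value with the date of the previous loading is one way a scheduler could decide whether a new loading task is needed.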
-
- Aug 09, 2022
Antoine Lambert authored
xmltodict cannot parse POM files with multi-byte encoding, so prefer to use the XML parser of BeautifulSoup based on lxml instead. Also drop the xmltodict requirement as it is no longer used in the swh-lister codebase.
-
- Aug 05, 2022
Franck Bret authored
Add incremental mode support based on a 'last_commit' state, used to get new package versions from git diff range of commits.
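The idea of an incremental mode driven by a 'last_commit' state can be sketched as follows. This is a minimal illustration, assuming the commit history is available as a list of identifiers ordered oldest first; the function name `new_commits` and the fallback to a full listing are assumptions made for this example, not the lister's actual code.

```python
from typing import List, Optional


def new_commits(history: List[str], last_commit: Optional[str]) -> List[str]:
    """Return the commits that come after last_commit in history (oldest first).

    If last_commit is None or is not found in history, fall back to a full
    listing by returning every commit.
    """
    if last_commit is None or last_commit not in history:
        return list(history)
    return history[history.index(last_commit) + 1:]


# Hypothetical commit identifiers, oldest first.
history = ["a1", "b2", "c3", "d4"]

to_process = new_commits(history, last_commit="b2")
# After processing, the state would be updated to the newest commit seen.
new_state = history[-1] if history else None
```

Only the commits in `to_process` would then need to be inspected for new package versions.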
-
- Apr 21, 2022
Antoine Lambert authored
Fix sourceforge origin URL for bzr projects: http://project.bzr.sourceforge.net/bzrroot/project redirects to http://project.bzr.sourceforge.net/bzr/project. Handle bzr projects with multiple branches: one listed origin must be created per branch. Discard bzr projects that no longer exist from listing.
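The URL change described above can be illustrated with a small string-rewriting sketch. The helper name `fix_bzr_origin_url` is hypothetical, and the real fix may work differently (for instance by following the HTTP redirect); this only shows the mapping from the old path to the new one.

```python
from urllib.parse import urlsplit, urlunsplit


def fix_bzr_origin_url(url: str) -> str:
    """Rewrite a sourceforge bzr URL from the /bzrroot/ path to the /bzr/ path."""
    parts = urlsplit(url)
    old_prefix = "/bzrroot/"
    if parts.netloc.endswith(".bzr.sourceforge.net") and parts.path.startswith(old_prefix):
        new_path = "/bzr/" + parts.path[len(old_prefix):]
        return urlunsplit(parts._replace(path=new_path))
    # Leave any other URL untouched.
    return url


fixed = fix_bzr_origin_url("http://project.bzr.sourceforge.net/bzrroot/project")
```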
-
- Nov 29, 2021
Boris Baldassari authored
The Maven lister retrieves the maven central indexes, exports them in a convenient text format, and parses them to identify all src archives and pom files in the maven repository. Then the pom files are downloaded and analysed to find and yield any scm reference. Note: This is a new version of the maven lister diff D6133 which takes into account the initial round of reviews. Related to T1724
-
- Feb 05, 2021
Antoine Lambert authored
xmltodict now raises an error while trying to parse the HTML content of the https://pypi.org/simple/ page. So use the BeautifulSoup HTML parser instead, as it is already a requirement of swh-lister and it does not fail when parsing the PyPI HTML page. Also drop the no longer used xmltodict from requirements.
-
- Feb 02, 2021
Antoine Lambert authored
Legacy Lister classes from the swh.lister.core module are no longer used in the swh-lister codebase, so it is time to remove them. Also remove lister CLI options related to the legacy Lister API. As a consequence, the following requirements are no longer needed: arrow, SQLAlchemy, sqlalchemy-stubs and testing.postgresql. Closes T2442
-
- Jan 28, 2021
Antoine Lambert authored
Port launchpad lister to the swh.lister.pattern.Lister API. The last update date of each listed git repository is now sent to the scheduler. The lister can work in incremental mode; in that case, only repositories modified since the last listing operation will be returned. Closes T2992
-
- Nov 14, 2019
Antoine R. Dumont authored
Related bb5d405
-
- Oct 28, 2019
Stefano Zacchiroli authored
-