- Oct 03, 2022
-
-
Antoine R. Dumont authored
Related to T3781
-
Antoine R. Dumont authored
Related to T3781
-
Antoine R. Dumont authored
Related to T3781
-
- Sep 30, 2022
-
-
Antoine Lambert authored
In listers that collect artifacts for each package to load, add artifact checksums, when that info is available, to the parameters sent to loaders so that the integrity of downloaded artifacts can be checked.
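A minimal sketch of what such loader parameters might look like. The helper name `build_loader_arguments` and the field names (`url`, `version`, `checksums`) are assumptions made for illustration; the keys actually used by each lister are not stated in this log.

```python
from typing import Any, Dict, List, Optional


def build_loader_arguments(
    url: str, version: str, sha256: Optional[str] = None
) -> Dict[str, List[Dict[str, Any]]]:
    """Hypothetical helper: describe one artifact for a loader."""
    artifact: Dict[str, Any] = {"url": url, "version": version}
    if sha256 is not None:
        # Attach checksums only when the upstream index provides them.
        artifact["checksums"] = {"sha256": sha256}
    # The artifacts are passed as a list, even for a single artifact.
    return {"artifacts": [artifact]}
```

A loader receiving these parameters could compare the advertised checksum with the one computed on the downloaded file and reject the artifact on mismatch.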
-
Franck Bret authored
'artifacts' extra_loader_arguments should be a list
-
- Sep 29, 2022
-
-
-
Antoine Lambert authored
Prefer executing the lister through a celery task, as this also helps catch possible issues with the task implementation. Also use docker compose v2 commands.
-
Antoine Lambert authored
The base lister class now ensures the count of listed origins will be accurate.
-
Antoine Lambert authored
Previously, the run method returned the total count of ListedOrigin objects sent to the scheduler database. However, some listers can send multiple ListedOrigin objects for a given origin URL during the listing process, for instance when an origin appears in multiple pages (e.g. gogs listing) or when the listing gathers multiple versions of an origin spread across multiple pages (e.g. maven listing). This change ensures an accurate count of listed origins by maintaining a set of the origin URLs associated with the sent ListedOrigin objects.
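The idea can be sketched as follows. This is an illustration only: `count_listed_origins` is a hypothetical helper, and each page is reduced to a plain list of origin URLs rather than ListedOrigin objects.

```python
from typing import Iterable, List, Set


def count_listed_origins(pages: Iterable[List[str]]) -> int:
    """Count distinct origin URLs across pages (hypothetical helper)."""
    seen: Set[str] = set()
    for page in pages:
        # Each page is a list of origin URLs sent to the scheduler;
        # the set ignores URLs that were already seen on earlier pages.
        seen.update(page)
    return len(seen)
```

With pages `[["a", "b"], ["b", "c"]]`, four objects are sent but only three distinct origins are counted.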
-
- Sep 27, 2022
-
-
Franck Bret authored
Related T1718
-
Franck Bret authored
The puppet lister retrieves origins from https://forge.puppet.com/modules. Related T4519
-
Franck Bret authored
Related T2833
-
Franck Bret authored
Use the HTTP API endpoint to get package names and build origin URLs.
-
Franck Bret authored
Related T4547
-
- Sep 26, 2022
-
-
Antoine R. Dumont authored
With the extension, the readme is included in the swh-docs build, which then fails. It is not intended for the documentation build, so renaming it keeps it out of the doc build loop. This fixes build [1]. [1] https://jenkins.softwareheritage.org/view/all/job/DDOC/job/dev/2395/
-
Antoine Lambert authored
That HTTP header value will now contain the lister name as well as a link to our contact form, so that sysadmins can easily reach us if needed. The following template is used to generate it: "Software Heritage <lister_name> lister v<swh-lister version> (+https://www.softwareheritage.org/contact)"
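A small sketch of how such a value could be built from the template. The function name `lister_header_value` and the example arguments are assumptions for illustration, not the names used in the code base.

```python
HEADER_TEMPLATE = (
    "Software Heritage {lister_name} lister v{version} "
    "(+https://www.softwareheritage.org/contact)"
)


def lister_header_value(lister_name: str, version: str) -> str:
    """Hypothetical helper: fill the template for one lister."""
    return HEADER_TEMPLATE.format(lister_name=lister_name, version=version)
```

For instance, `lister_header_value("pubdev", "1.2.3")` yields `Software Heritage pubdev lister v1.2.3 (+https://www.softwareheritage.org/contact)`.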
-
Antoine Lambert authored
Numerous listers were using the same page_request method, or an equivalent, in their implementation, so deduplicate that code by adding an http_request method to the base lister class, swh.lister.pattern.Lister. That method simply wraps a call to requests.Session.request and logs some useful info for debugging and error reporting; an HTTPError is raised if a request ends with an error. All listers using the new method now benefit from request retries when an HTTP error occurs, thanks to the http_retry decorator.
-
Antoine Lambert authored
Instead of retrying HTTP requests only for the 429 status code by default, prefer the generic retry policy, which also retries on status codes >= 500 and on ConnectionError exceptions. Rename the throttling_retry decorator to http_retry to reflect this change.
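The retry policy described above can be sketched with the standard library alone. This is not the http_retry implementation: `HTTPStatusError`, `is_retryable` and `retry_http` are hypothetical names, and the built-in `ConnectionError` stands in for the exception raised by the HTTP library.

```python
import functools
import time


class HTTPStatusError(Exception):
    """Hypothetical stand-in for an HTTP error carrying a status code."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def is_retryable(exc: BaseException) -> bool:
    """Retry on 429, on status codes >= 500, and on connection errors."""
    if isinstance(exc, ConnectionError):
        return True
    if isinstance(exc, HTTPStatusError):
        return exc.status_code == 429 or exc.status_code >= 500
    return False


def retry_http(max_attempts: int = 3, delay: float = 0.0):
    """Hypothetical decorator: call again while the error is retryable."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    # Give up on the last attempt or on non-retryable errors.
                    if attempt == max_attempts or not is_retryable(exc):
                        raise
                    time.sleep(delay)

        return wrapper

    return decorator
```

A fixed delay keeps the sketch short; a production policy would more likely wait longer between successive attempts.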
-
- Sep 20, 2022
-
-
Vincent Sellier authored
For some forges, the default tab of a repository detail page is not the summary tab, so the clone URLs are not detected and the repository is ignored. Related to T4544
-
Kumar Shivendu authored
This also affects the gitea lister
-
- Sep 19, 2022
-
-
Antoine Lambert authored
Align with other lister names by turning it to lowercase.
-
- Sep 13, 2022
-
-
Antoine Lambert authored
-
Antoine Lambert authored
It ensures created temporary directories will be removed once they are no longer needed.
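One common way to get this behaviour in Python is the `tempfile.TemporaryDirectory` context manager. Whether the commit uses it is an assumption; the sketch below, with its hypothetical helper `write_and_count`, only illustrates the cleanup guarantee.

```python
import tempfile
from pathlib import Path


def write_and_count(names):
    """Hypothetical helper: write files to a scratch directory, count them."""
    with tempfile.TemporaryDirectory() as tmpdir:
        for name in names:
            (Path(tmpdir) / name).write_text("data")
        count = len(list(Path(tmpdir).iterdir()))
    # The directory and its contents are deleted when the with block exits.
    return count, Path(tmpdir).exists()
```

The second element of the returned tuple reports whether the directory still exists after the block, which it should not.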
-
- Sep 09, 2022
-
-
Antoine R. Dumont authored
This matches other lister verbosity. Related to T4517
-
- Sep 07, 2022
-
-
Antoine Lambert authored
Use a value that matches the good practice recommended by the pub.dev REST API doc. https://github.com/dart-lang/pub/blob/master/doc/repository-spec-v2.md
-
- Sep 02, 2022
-
-
Antoine Lambert authored
In order to get a last_update for each ListedOrigin sent to the scheduler database, send an extra HTTP request for each listed package to the /api/packages/<package_name> endpoint of the pub.dev API. A pub.dev developer informed us that this endpoint is heavily used and cached, so there is no particular issue with periodically querying it for each package in a row.
-
Antoine Lambert authored
Use https://pub.dev/packages/<package_name> instead of https://pub.dev/api/packages/<package_name>
-
Antoine Lambert authored
It will make it possible to archive the history of the PKGBUILD file associated with the AUR package.
-
Antoine Lambert authored
Use https://aur.archlinux.org/packages/<package_name> instead of https://aur.archlinux.org/<package_name>.git
-
Antoine Lambert authored
Simplify the code for downloading the packages index, as gzip and deflate transfer encodings are automatically decoded by requests. Also, do not stream responses that are only a couple of megabytes; store HTTP responses in memory instead. Also add more debug logs to track lister execution.
-
- Sep 01, 2022
-
-
Antoine Lambert authored
-
- Aug 30, 2022
-
-
Raphaël Gomès authored
-
Raphaël Gomès authored
This uses https://index.golang.org. An associated loader will be sent in the near future, as well as an incremental version of this lister. [1] https://go.dev/ref/mod#goproxy-protocol
-
Franck Bret authored
-
- Aug 29, 2022
-
-
Franck Bret authored
Origin URLs for Bower are git repositories. Set the VISIT_type to 'git'. No need for a specific 'Bower' package loader.
-
Franck Bret authored
-
- Aug 26, 2022
-
-
Franck Bret authored
Stateless lister for https://pub.dev, based on the HTTP API, to list package names
-
- Aug 25, 2022
- Aug 24, 2022
-
-
vlorentz authored
By using a single equality instead of checking len() then zip() to check one by one, pytest can find the common/missing elements and print them nicely when the two lists are unequal.
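A sketch of the two assertion styles, using a hypothetical helper `check_origins`; the earlier style is shown in comments for comparison.

```python
def check_origins(actual, expected):
    """Hypothetical test helper comparing two lists of origins."""
    # Earlier style: length check, then element-wise comparison.
    #     assert len(actual) == len(expected)
    #     for a, e in zip(actual, expected):
    #         assert a == e

    # Single equality: on failure, pytest can introspect both lists
    # and report which elements differ or are missing.
    assert actual == expected
```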
-