  1. May 07, 2021
  2. May 06, 2021
    • Make the SourceForge lister incremental · 3baf1d09
      Raphaël Gomès authored
      SourceForge's sitemaps (1 main one + many sharded) give us a "last
      modified" date for every subsitemap and project, allowing us to perform
      an incremental listing.
      
      We store the subsitemaps' "last modified" dates in the lister state, as
      well as those of the empty projects (projects which don't have any VCS
      registered), and the rest comes from the already visited origins from
      the database.
      
      The tests try to cover the possible cases: a subsitemap that has
      changed, one that hasn't, a project that has changed, one that hasn't,
      and the same for an empty project.
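
The comparison described in this commit message can be sketched as follows. This is a minimal illustration, not the lister's actual code: the function name `changed_since_last_run`, the dictionary-based state, and the example.org URLs are all assumptions made for the example.

```python
from datetime import datetime, timezone
from typing import Dict, List


def changed_since_last_run(
    last_modified: Dict[str, datetime],
    previously_seen: Dict[str, datetime],
) -> List[str]:
    """Return the URLs that are new or whose "last modified" date is more recent.

    Hypothetical helper: `previously_seen` stands in for the dates kept in
    the lister state (or derived from already visited origins).
    """
    changed = []
    for url, modified in last_modified.items():
        seen = previously_seen.get(url)
        if seen is None or modified > seen:
            changed.append(url)
    return changed


# Dates recorded during the previous listing (illustrative values).
state = {
    "https://example.org/sitemap-0.xml": datetime(2021, 5, 1, tzinfo=timezone.utc),
    "https://example.org/sitemap-1.xml": datetime(2021, 5, 1, tzinfo=timezone.utc),
}
# Dates advertised by the main sitemap during the current listing.
current = {
    "https://example.org/sitemap-0.xml": datetime(2021, 5, 1, tzinfo=timezone.utc),
    "https://example.org/sitemap-1.xml": datetime(2021, 5, 6, tzinfo=timezone.utc),
    "https://example.org/sitemap-2.xml": datetime(2021, 5, 6, tzinfo=timezone.utc),
}

to_visit = changed_since_last_run(current, state)
print(to_visit)
```

Only the subsitemap that changed and the one that was never seen are returned; the unchanged one is skipped, which is what makes the listing incremental.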
      v1.2.0
  3. Apr 28, 2021
    • tox: Add sphinx environments to check sane doc build · 6f8dd5d3
      Antoine Lambert authored
      Make it possible to check that the package documentation can be built
      without producing Sphinx warnings.
      
      The sphinx environment is designed to be used in continuous integration
      in order to prevent breaking the documentation build when committing changes.
      
      The sphinx-dev environment is designed to be used inside a full swh
      development environment.
      
      Related to T3258
      v1.1.0
  4. Apr 27, 2021
    • s/REST( API)?/API/ · 18b68bd8
      vlorentz authored
      Bitbucket's API kind of supports REST workflows, but they clearly use it
      like an RPC API (the hardcoded schema in `PROJECT_API_URL_FORMAT`
      makes it particularly clear).
  5. Apr 13, 2021
  6. Apr 04, 2021
  7. Mar 23, 2021
    • Add a non-incremental sourceforge lister · f7b27c69
      Raphaël Gomès authored
      Following zack's work on T735, this change introduces an actual SWH lister for
      SourceForge.
      
      SourceForge provides a main sitemap that lists sharded sitemaps, which
      themselves list pages. Each page belongs to a project (or sub-project,
      though those are rare), information about which can be found by querying
      a REST API, which gives us the list of any and all VCS used for said
      project. Both sitemaps and pages have a "last modified" timestamp that
      will be used in a future patch to implement incremental listing.
      
      More precise information can be found as inline comments or docstrings.
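
The sitemap traversal described in this commit message can be sketched with the standard library. This sketch assumes the sitemaps follow the sitemaps.org XML schema; the inline XML documents, the example.org URLs, and the `iter_entries` helper are illustrative and are not taken from the lister.

```python
import xml.etree.ElementTree as ET
from typing import Iterator, Optional, Tuple

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Illustrative main sitemap: it lists the sharded sitemaps.
MAIN_SITEMAP = """
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.org/sitemap-0.xml</loc>
    <lastmod>2021-03-18</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.org/sitemap-1.xml</loc>
    <lastmod>2021-03-20</lastmod>
  </sitemap>
</sitemapindex>
"""

# Illustrative sharded sitemap: it lists pages belonging to projects.
SUB_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.org/projects/foo/files/</loc>
    <lastmod>2021-03-17</lastmod>
  </url>
</urlset>
"""


def iter_entries(xml_text: str, tag: str) -> Iterator[Tuple[Optional[str], Optional[str]]]:
    """Yield (location, last modified) pairs for each <tag> entry."""
    root = ET.fromstring(xml_text)
    for entry in root.findall(f"sm:{tag}", SITEMAP_NS):
        loc = entry.findtext("sm:loc", namespaces=SITEMAP_NS)
        lastmod = entry.findtext("sm:lastmod", namespaces=SITEMAP_NS)
        yield loc, lastmod


subsitemaps = list(iter_entries(MAIN_SITEMAP, "sitemap"))
pages = list(iter_entries(SUB_SITEMAP, "url"))
print(subsitemaps)
print(pages)
```

In a real lister, each page URL would then be mapped to its project, and the project's VCS list would be fetched from the API; that step is left out here because it needs network access.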
  8. Mar 19, 2021
  9. Feb 26, 2021
  10. Feb 08, 2021
  11. Feb 05, 2021
  12. Feb 02, 2021
    • Remove no longer used legacy Lister API and update CLI options · 89335445
      Antoine Lambert authored
      Legacy Lister classes from the swh.lister.core module are no longer
      used in the swh-lister codebase, so it is time to remove them.
      
      Also remove lister CLI options related to legacy Lister API.
      
      As a consequence, the following requirements are no longer needed:
      arrow, SQLAlchemy, sqlalchemy-stubs and testing.postgresql.
      
      Closes T2442
    • packagist: Reimplement lister using new Lister API · ff05191b
      Antoine Lambert authored
      The previous implementation was generating tasks for an unimplemented
      Packagist loader.
      
      The new implementation extracts the source repository URL, VCS type and
      last update date for each package referenced by Packagist and sends
      this information to the scheduler.
      
      Package metadata is retrieved using Packagist API endpoints whose
      responses are served from static files, which are guaranteed to be
      efficient on the Packagist side (no dynamic queries).
      Furthermore, subsequent listings will send the "If-Modified-Since" HTTP
      header to only retrieve package metadata updated since the previous
      listing operation, in order to save bandwidth and return only origins
      which might have newly released versions.
      
      Closes T2991
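
The conditional request mentioned in this commit message relies on an HTTP date in the "If-Modified-Since" header. The sketch below only builds that header with the standard library; the `conditional_headers` helper and the example date are assumptions for illustration and are not the lister's code.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict, Optional


def conditional_headers(last_listing: Optional[datetime]) -> Dict[str, str]:
    """Build request headers for a listing (hypothetical helper).

    On the first listing there is no previous date, so no header is sent
    and the server returns the full metadata.
    """
    if last_listing is None:
        return {}
    # HTTP dates are expressed in GMT, so normalize to UTC first.
    as_utc = last_listing.astimezone(timezone.utc)
    return {"If-Modified-Since": format_datetime(as_utc, usegmt=True)}


previous = datetime(2021, 2, 1, 12, 0, 0, tzinfo=timezone.utc)
headers = conditional_headers(previous)
print(headers)
```

A server that honors the header answers 304 Not Modified with an empty body when nothing changed, so the caller can skip that package without downloading its metadata again.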
    • gnu: Remove dependency on pytz · 82ab96ad
      Antoine Lambert authored
      The UTC timezone can be obtained from datetime.timezone in the Python
      standard library, so remove the dependency on the external pytz
      module.
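
As a small illustration of the replacement, `datetime.timezone.utc` from the standard library is enough to produce timezone-aware UTC datetimes. The timestamp value below is arbitrary and chosen only for the example.

```python
from datetime import datetime, timezone

# Build an aware datetime from a POSIX timestamp without pytz.
from_epoch = datetime.fromtimestamp(0, tz=timezone.utc)
print(from_epoch.isoformat())

# Attach UTC to a naive datetime that is already known to be in UTC.
naive = datetime(2021, 2, 2, 10, 30)
aware = naive.replace(tzinfo=timezone.utc)
print(aware.isoformat())
```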
  13. Feb 01, 2021
  14. Jan 29, 2021
  15. Jan 28, 2021