  1. Mar 21, 2025
  2. Mar 13, 2025
  3. Feb 26, 2025
  4. Feb 25, 2025
  5. Feb 20, 2025
  6. Feb 17, 2025
  7. Feb 10, 2025
    • maven: Update test that is now failing since beautifulsoup4 4.13 · a3d66736
      Antoine Lambert authored
      The latest beautifulsoup4 release (4.13) seems to have fixed issues
      related to unexpected encodings in XML files, so a test that was
      passing previously is now failing.
      
      Update that test to check that the origin URL and visit type can be
      successfully extracted from a POM file with an unexpected encoding.
      a3d66736
  8. Jan 22, 2025
  9. Dec 11, 2024
  10. Nov 07, 2024
  11. Oct 29, 2024
  12. Oct 28, 2024
  13. Oct 24, 2024
  14. Oct 14, 2024
  15. Sep 05, 2024
    • sourceforge: Also skip ConnectionError when fetching project info · 927aebbd
      Antoine Lambert authored
      The sourceforge lister sends various HTTP requests to get info about a
      project, for instance to get the branch name of a Bazaar project.
      
      If HTTP errors occurred during these steps, they were discarded so that
      the listing could continue, but connection errors were not, and as a
      consequence the listing failed when encountering such an error.
      
      Currently, the legacy Bazaar project hosted on sourceforge seems to be
      down and connection errors are raised when attempting to fetch branch
      names, so the lister does not process all projects as it crashes mid-flight.
      v6.8.0
      927aebbd
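The behaviour described in this commit message can be sketched as follows. This is a minimal illustration, not the lister's code: it assumes the `requests` library is used for HTTP, and the helper `fetch_or_skip`, the stand-in `unreachable`, and the example URL are all hypothetical.

```python
import logging
from typing import Callable, Optional

import requests

logger = logging.getLogger(__name__)


def fetch_or_skip(fetch: Callable[[str], str], url: str) -> Optional[str]:
    """Call fetch(url); return None instead of raising on HTTP or connection errors."""
    try:
        return fetch(url)
    except (requests.HTTPError, requests.ConnectionError) as exc:
        # Log and skip this piece of project info so the listing can continue.
        logger.warning("Skipping %s: %s", url, exc)
        return None


def unreachable(url: str) -> str:
    # Hypothetical stand-in for a request to a host that is down.
    raise requests.ConnectionError(f"cannot connect to {url}")


result = fetch_or_skip(unreachable, "https://bzr.example.org/project")
```

Catching both exception types in one place means a host that is down is treated the same way as a host that answers with an error status: the project info is skipped and the remaining projects are still processed.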
  16. Sep 04, 2024
    • Add save-bulk lister to check origins prior their insertion in database · af24960b
      Antoine Lambert authored
      This new, special-purpose lister makes it possible to verify a list of
      origins to archive provided by users (for instance through the Web API).
      
      Its purpose is to avoid polluting the scheduler database with origins that
      cannot be loaded into the archive.
      
      Each origin is identified by a URL and a visit type. For a given visit type,
      the lister checks whether the origin URL can be found and whether the visit
      type is valid.
      
      The supported visit types are those for VCS (bzr, cvs, hg, git and svn) plus
      the one for loading tarball content into the archive.
      
      Accepted origins are inserted or upserted in the scheduler database.
      
      Rejected origins are stored in the lister state.
      
      Related to #4709
      af24960b
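The accept/reject flow described in this commit message might look roughly like the sketch below. Everything here is illustrative: the function `check_origins`, the injected `is_valid` callback, and the `"tarball"` label are assumptions, since the commit message names neither the lister's internals nor the exact tarball visit type.

```python
from typing import Callable, Iterable

# VCS visit types named in the commit message; "tarball" is a placeholder
# label for the tarball visit type, whose actual name is not given there.
SUPPORTED_VISIT_TYPES = {"bzr", "cvs", "hg", "git", "svn", "tarball"}

Origin = tuple[str, str]  # (origin URL, visit type)


def check_origins(
    origins: Iterable[Origin],
    is_valid: Callable[[str, str], bool],
) -> tuple[list[Origin], dict[str, str]]:
    """Split origins into accepted ones and rejected ones with a reason."""
    accepted: list[Origin] = []
    rejected: dict[str, str] = {}
    for url, visit_type in origins:
        if visit_type not in SUPPORTED_VISIT_TYPES:
            rejected[url] = f"unsupported visit type {visit_type!r}"
        elif not is_valid(url, visit_type):
            rejected[url] = f"origin not found or not a valid {visit_type} origin"
        else:
            accepted.append((url, visit_type))
    return accepted, rejected


# Hypothetical usage: the callback accepts every origin of a supported type.
accepted, rejected = check_origins(
    [("https://example.org/repo.git", "git"), ("https://example.org/x", "foo")],
    is_valid=lambda url, visit_type: True,
)
```

In this sketch the accepted list stands for what would be sent to the scheduler database, and the rejected mapping for what would be kept in the lister state.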
  17. Sep 02, 2024
  18. Aug 27, 2024
  19. Jul 18, 2024
  20. Jun 28, 2024
  21. Jun 05, 2024
    • gitea, gogs: Ensure query parameters are not duplicated in API URLs · 323e2774
      Antoine Lambert authored
      The Gitea API returns the next pagination link with all query parameters
      provided to an API request.
      
      As we were also passing a dict of fixed query parameters to the page_request
      method, some query parameters ended up with multiple instances in the URL
      for fetching a new page of repository data. Each time a new page was
      requested, new instances of these parameters were appended to the URL, which
      could result in a really long URL if the number of pages to retrieve is high
      and make the request fail.
      
      Also remove a debug log already emitted by the http_request method.
      323e2774
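One way to avoid the duplication described in this commit message is to merge the fixed parameters into the query string the pagination link already carries. The sketch below uses only the standard library; the helper `merge_query_params` and the example URL are hypothetical and do not claim to reproduce the actual fix.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def merge_query_params(url: str, params: dict[str, str]) -> str:
    """Return url with params applied, keeping a single instance of each key."""
    parts = urlsplit(url)
    # Parameters already present in the URL (e.g. from a pagination link).
    # Building a dict keeps only the last value of any repeated key.
    merged = dict(parse_qsl(parts.query, keep_blank_values=True))
    # Only add fixed parameters that the URL does not already carry.
    for key, value in params.items():
        merged.setdefault(key, value)
    return urlunsplit(parts._replace(query=urlencode(merged)))


# Hypothetical pagination link that already echoes the fixed parameters.
next_link = "https://gitea.example.org/api/v1/repos/search?limit=50&sort=id&page=2"
fixed_params = {"sort": "id", "limit": "50"}
page_url = merge_query_params(next_link, fixed_params)
```

Because the merge is idempotent, applying it to every page keeps the URL the same length no matter how many pages are fetched.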
  22. May 22, 2024
  23. Apr 24, 2024
  24. Apr 16, 2024
    • Use beautifulsoup4 CSS selectors to simplify code and type checking · 41407e0e
      Antoine Lambert authored
      The types-beautifulsoup4 package gets installed in the swh virtualenv
      because it is a swh-scanner test dependency, so some mypy errors related
      to beautifulsoup4 typing were reported.
      
      Since the find method of bs4 returns the union
      Tag | NavigableString | None, isinstance calls must be used to ensure
      proper typing, which is not great.
      
      So prefer the select_one method, which returns Optional[Tag] and therefore
      only needs a simple None check to ensure correct typing. Similarly,
      replace uses of the find_all method with the select method.
      
      This also has the advantage of simplifying the code.
      41407e0e
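The pattern preferred in this commit message can be illustrated with a small example. The HTML snippet and the `a.repo` selector are made up for illustration; only `select_one` and `select` come from the commit message, and the sketch assumes beautifulsoup4 is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical page fragment, parsed with the stdlib-backed parser.
html = '<div><a class="repo" href="https://example.org/repo.git">repo</a></div>'
soup = BeautifulSoup(html, "html.parser")

# select_one returns a Tag or None, so a plain None check is enough.
link = soup.select_one("a.repo")
href = link["href"] if link is not None else None

# select returns a list of Tag objects matching the CSS selector.
urls = [a["href"] for a in soup.select("a.repo[href]")]

# A selector that matches nothing yields None rather than raising.
missing = soup.select_one("a.does-not-exist")
```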
  25. Mar 29, 2024
  26. Mar 14, 2024
  27. Mar 13, 2024