  1. Nov 04, 2022
    • Implement discovering branch targets from the archive · 92d9ada9
      Nicolas Dandrimont authored
      With the proper implementation of packfile negotiation, remotes can
      return packfiles that do not contain all of our wanted objects. Consider
      the following two histories:
      
      * c1                                      * c1     ← [refs/tags/original]
      ↑                                         ↑ ↖
      * c2 ← [refs/heads/main]                  |   * c3 ← [refs/heads/main]
                                                * c2     ← [refs/heads/broken]
      
      The first visit of the origin would load commits c1 and c2, and write a
      snapshot referencing c2.
      
      During the second visit, the loader would tell the origin that c2 is
      known, and that c1 and c3 are wanted (as new heads). The origin, knowing
      that c1 is a parent of c2, would be allowed by the git protocol to send
      a packfile containing only c3. Under these circumstances, the loader
      cannot tell what object type the snapshot branch
      [refs/tags/original] should point to.
      
      The repository in tests has a similar structure ([refs/heads/master] is
      in the history of [refs/tags/branch2-before-delete]), so refactor the
      incremental load test to exercise this specific behavior. This test can
      be moved to the common tests as well.
      v2.1.0
      92d9ada9
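The scenario in the commit message above can be sketched as a small simulation. This is a toy model of the negotiation, not the loader's code: `reachable`, `minimal_pack` and `target_type` are hypothetical helpers, and the "archive" is a plain dictionary. It shows why the pack may contain only c3, and why the type of c1 then has to be looked up among objects that are already archived.

```python
from typing import Dict, List, Optional, Set


def reachable(parents: Dict[str, List[str]], tips: Set[str]) -> Set[str]:
    """Return every commit reachable from `tips` by following parent links."""
    seen: Set[str] = set()
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return seen


def minimal_pack(
    parents: Dict[str, List[str]], wants: Set[str], haves: Set[str]
) -> Set[str]:
    """Objects a remote may send: wanted history minus what the client has."""
    return reachable(parents, wants) - reachable(parents, haves)


def target_type(
    obj: str, pack_types: Dict[str, str], archive_types: Dict[str, str]
) -> Optional[str]:
    """Look in the pack first, then fall back to the archive (hypothetical)."""
    if obj in pack_types:
        return pack_types[obj]
    return archive_types.get(obj)


# Second history from the commit message: c2 and c3 both have c1 as parent.
parents = {"c1": [], "c2": ["c1"], "c3": ["c1"]}

# Second visit: c2 is known, c1 and c3 are wanted as new heads.
pack = minimal_pack(parents, wants={"c1", "c3"}, haves={"c2"})

pack_types = {obj: "commit" for obj in pack}
archive_types = {"c1": "commit", "c2": "commit"}  # loaded by the first visit
```

With these inputs the pack holds only c3, so the type of the target of [refs/tags/original] (c1) can only be resolved through the archive lookup.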
    • dumb loader: also filter the symbolic refs · e7988153
      Nicolas Dandrimont authored
      Even though this is only HEAD, we should make sure that it's filtered anyway.
      e7988153
    • Make utils.filter_refs accept {bytes: bytes} and {bytes: HexBytes} · c2ed09e0
      Nicolas Dandrimont authored
      In terms of mypy, this function is just doing some types-washing anyway.
      c2ed09e0
    • Eagerly populate the set of local heads in RepoRepresentation.__init__ · 3cf7582a
      Nicolas Dandrimont authored
      dulwich's client.fetch_pack expects a history graph walker that is
      initialized with the set of known heads, so move the caching of local
      heads from `determine_wants` to the RepoRepresentation initialization
      logic.
      
      Our previous code would always initialize the graph walker with an empty
      set of heads (as the `graph_walker()` method is called before
      `determine_wants()` has run, so `self.heads` was always empty), so we
      would never actually fetch an incremental pack file.
      v2.0.0
      3cf7582a
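The ordering problem described in the commit message above can be illustrated with two toy classes. This is a sketch under assumptions: `LazyRepo` and `EagerRepo` are hypothetical stand-ins for the old and new behaviour, and the "graph walker" is reduced to a copy of the heads set taken at call time.

```python
from typing import Dict, Iterable, List, Set


class LazyRepo:
    """Old behaviour (hypothetical): heads are filled by determine_wants()."""

    def __init__(self, local_heads: Iterable[str]) -> None:
        self._local_heads = set(local_heads)
        self.heads: Set[str] = set()  # still empty when graph_walker() runs

    def graph_walker(self) -> Set[str]:
        # Stand-in for a graph walker: it copies the heads known right now.
        return set(self.heads)

    def determine_wants(self, remote_refs: Dict[str, str]) -> List[str]:
        self.heads = set(self._local_heads)  # too late for the walker
        return sorted(t for t in remote_refs.values() if t not in self.heads)


class EagerRepo(LazyRepo):
    """New behaviour (hypothetical): heads are populated in __init__."""

    def __init__(self, local_heads: Iterable[str]) -> None:
        super().__init__(local_heads)
        self.heads = set(self._local_heads)


remote_refs = {"refs/heads/main": "c3", "refs/heads/broken": "c2"}

# The walker is created before determine_wants() is called.
lazy = LazyRepo({"c2"})
lazy_walker = lazy.graph_walker()
lazy_wants = lazy.determine_wants(remote_refs)

eager = EagerRepo({"c2"})
eager_walker = eager.graph_walker()
eager_wants = eager.determine_wants(remote_refs)
```

Both variants compute the same wants, but only the eager one hands a non-empty set of known heads to the walker, which is what allows the remote to send an incremental pack.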
  2. Nov 03, 2022
  3. Oct 31, 2022
  4. Oct 25, 2022
  5. Oct 19, 2022
  6. Oct 18, 2022
  7. Jul 19, 2022
  8. Jun 16, 2022
  9. May 24, 2022
  10. May 20, 2022
  11. May 17, 2022
  12. May 16, 2022
  13. May 13, 2022
    • Use all base snapshots in determine_wants() · 9b47b24b
      vlorentz authored
      Before this commit, determine_wants() used the origin's last snapshot
      if any, or the closest parent's snapshot if not.
      
      However, we noticed that many repositories that are very slow to load
      are forks that were already visited, but whose owner has rebased them
      on the parent since the last visit, potentially adding many commits to
      the origin.
      
      This ensures we do not needlessly fetch these new commits when we
      already loaded the parent.
      v1.8.0
      9b47b24b
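A minimal sketch of the idea in the commit message above, assuming snapshots can be reduced to a mapping from branch name to target id. `known_targets` and this simplified `determine_wants` are hypothetical and only compare ref targets; the real loader also feeds the known heads into the pack negotiation.

```python
from typing import Dict, Iterable, List, Set

Snapshot = Dict[str, str]  # branch name -> target id (simplified)


def known_targets(base_snapshots: Iterable[Snapshot]) -> Set[str]:
    """Union of the branch targets of every base snapshot."""
    known: Set[str] = set()
    for snapshot in base_snapshots:
        known.update(snapshot.values())
    return known


def determine_wants(
    remote_refs: Dict[str, str], base_snapshots: Iterable[Snapshot]
) -> List[str]:
    """Remote targets that no base snapshot already references."""
    return sorted(set(remote_refs.values()) - known_targets(base_snapshots))


# The fork was visited before (a1), then rebased on its parent's head (p9).
last_snapshot = {"refs/heads/main": "a1"}
parent_snapshot = {"refs/heads/main": "p9"}
remote_refs = {"refs/heads/main": "f2", "refs/heads/upstream": "p9"}

wants_last_only = determine_wants(remote_refs, [last_snapshot])
wants_all_bases = determine_wants(remote_refs, [last_snapshot, parent_snapshot])
```

Using only the origin's last snapshot, p9 looks new even though it was already loaded through the parent; combining both snapshots leaves only f2 to fetch.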
  14. May 06, 2022
  15. May 02, 2022
  16. Apr 27, 2022
    • Replace 'base_url' argument with 'self.parent_origins' attribute · 4ede7b35
      vlorentz authored
      self.parent_origins is set dynamically by the core loader at the
      beginning of the load (before calling `prepare()`), using the right
      metadata loaders.
      v1.6.0
      4ede7b35
    • tasks: Simplify implementation and make visit_date parameter optional · 05242cd4
      Antoine Lambert authored
      Recent changes in swh-scheduler add new parameters to the celery tasks
      produced from swh.scheduler.model.ListedOrigin instances.
      
      So make sure any new parameters are handled by no longer hardcoding
      the expected ones in task signatures.
      
      Rename the date parameter to visit_date in the from-disk loader tasks
      and make it optional.
      
      Add new tests checking that task parameters produced from ListedOrigin
      instances do not raise an error when attempting to create a git loader.
      
      Related to T4187
      05242cd4
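The approach described in the commit message above, not hardcoding parameters in task signatures, can be sketched as follows. `FromDiskLoader` and `load_git_from_dir` are hypothetical stand-ins rather than the project's classes or task names, and the Celery decorator is left out so the example stays self-contained.

```python
from typing import Any, Dict, Optional


class FromDiskLoader:
    """Hypothetical loader: visit_date is optional, extra options are kept."""

    def __init__(
        self,
        url: str,
        directory: str,
        visit_date: Optional[str] = None,
        **extra: Any,
    ) -> None:
        self.url = url
        self.directory = directory
        self.visit_date = visit_date
        self.extra = extra  # parameters added later by the scheduler

    def load(self) -> Dict[str, str]:
        return {"status": "uneventful"}


def load_git_from_dir(**kwargs: Any) -> Dict[str, str]:
    """Task body: forward every keyword argument instead of listing them."""
    return FromDiskLoader(**kwargs).load()


# A parameter the task never declared (here `lister_name`) does not break it.
result = load_git_from_dir(
    url="https://example.org/repo.git",
    directory="/tmp/repo",
    lister_name="example",
)
```

Because the task forwards `**kwargs` unchanged, a new scheduler parameter only needs to be understood by the loader, not by every task signature.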
  17. Apr 26, 2022
  18. Apr 21, 2022
  19. Apr 20, 2022
  20. Apr 08, 2022
  21. Mar 22, 2022
    • pytest: Exclude build directory for tests discovery · 96987c2c
      Antoine Lambert authored
      setuptools copies test modules into subdirectories of the build
      directory, which makes pytest fail with ImportPathMismatchError
      exceptions when invoked from the root directory of the module.
      
      So ignore the build folder during test discovery.
      96987c2c
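The commit message above does not show which setting was changed. As one possible way to get the same effect, and purely as an assumption, a root-level `conftest.py` can exclude the directory through pytest's `collect_ignore` variable:

```python
# conftest.py at the repository root (illustrative; the commit may instead
# have changed a pytest configuration file).

# Paths listed here are relative to this conftest.py and are skipped
# during test collection.
collect_ignore = ["build"]
```

The `norecursedirs` option in the pytest configuration file is an alternative that keeps the exclusion out of Python code.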
  22. Feb 10, 2022
  23. Jan 21, 2022
  24. Jan 14, 2022
  25. Jan 11, 2022