  1. Oct 09, 2023
  2. Oct 05, 2023
  3. Sep 18, 2023
    • directory: Refine the way submodules are handled · f9c18e78
      Antoine Lambert authored
      The git directory loader is used to archive guix source packages where
      source code is located in a git repository at a specific reference.
      
      To ensure SWH archives the exact same set of source code files for a
      guix package, the recursive NAR hash of the source code directory is
      computed and compared against the one computed by guix.
      
      Previously the loader always fetched git submodules when some were
      set for the git repository, but guix only fetches them for a few
      packages and not for all git-based ones. This could result in a
      directory hash mismatch when the loader fetched submodules that it
      should not have.
      
      To work around this, first compute the NAR hash without fetching
      submodules; if this results in a directory hash mismatch, retry
      the operation with the submodules fetched.
      
      Related to #4751.
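
The two-step strategy described above can be sketched as follows. This is a minimal illustration, not the loader's code: `needs_submodules` and the `compute_nar_hash` callable are hypothetical names, and the callable is assumed to fetch the directory (with or without submodules) and return its recursive NAR hash.

```python
from typing import Callable


def needs_submodules(
    compute_nar_hash: Callable[[bool], str],
    expected_hash: str,
) -> bool:
    """Return True if the expected hash is only matched with submodules fetched.

    `compute_nar_hash(with_submodules)` is a hypothetical callable standing in
    for "fetch the directory and hash it".
    """
    # First attempt without submodules, then retry with them on mismatch.
    for with_submodules in (False, True):
        if compute_nar_hash(with_submodules) == expected_hash:
            return with_submodules
    raise ValueError("directory hash mismatch with and without submodules")


# Toy usage: the hash differs depending on whether submodules are present.
fake_hashes = {False: "hash-without", True: "hash-with"}
print(needs_submodules(lambda flag: fake_hashes[flag], "hash-with"))
```

Trying the cheaper fetch first means submodules are only downloaded for the packages whose expected hash requires them.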
  4. Aug 24, 2023
  5. Aug 22, 2023
  6. Aug 21, 2023
  7. Aug 07, 2023
  8. Jul 03, 2023
    • loader: Ensure NotFound exception is reraised when caught · 3b18c155
      Antoine Lambert authored
      The previous commit modified the dumb.check_protocol function to raise an
      HTTPError exception when the request to check dumb protocol support
      failed. As NotFound exception inherits from ValueError, the code for
      checking dumb protocol support was executed even when a repository was
      not found. So an HTTPError exception was raised with a 404 status code
      and the NotFound exception was no longer propagated to the base loader
      class, resulting in a failed visit status instead of a not_found one.
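
The exception-ordering problem described above can be reproduced with a small sketch. The classes and functions below are hypothetical stand-ins, not the loader's actual code; the only facts taken from the commit message are that NotFound inherits from ValueError and that the dumb protocol check may raise an HTTPError.

```python
class NotFound(ValueError):
    """Stand-in for the loader's NotFound exception (a ValueError subclass)."""


class HTTPError(Exception):
    """Stand-in for the error raised by the dumb protocol check."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def fetch(smart_fetch, check_dumb_protocol) -> str:
    try:
        return smart_fetch()
    except NotFound:
        # Reraise first: without this clause, the ValueError handler below
        # would also catch NotFound and run the dumb protocol check.
        raise
    except ValueError:
        check_dumb_protocol()
        return "dumb"


def visit_status(smart_fetch, check_dumb_protocol) -> str:
    """Map the outcome of a fetch to a visit status string."""
    try:
        fetch(smart_fetch, check_dumb_protocol)
    except NotFound:
        return "not_found"
    except Exception:
        return "failed"
    return "full"
```

With the explicit `except NotFound: raise` clause, a missing repository yields a `not_found` status even when the dumb protocol check would itself fail with a 404.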
  9. Jun 14, 2023
  10. Jun 09, 2023
  11. Jun 07, 2023
  12. Jun 06, 2023
  13. Jun 05, 2023
  14. Jun 01, 2023
  15. May 05, 2023
  16. Apr 26, 2023
  17. Apr 20, 2023
    • loader: Check pack size of non archived github origin prior fetching it · 0474c01f
      Antoine Lambert authored
      GitHub API provides for each repository the pack file size in kibibytes
      corresponding to a full clone.
      
      As metadata for a GitHub repository are fetched at the beginning of the
      loading process (currently only for origins discovered by the github lister),
      parse their raw JSON bytes and store pack file size as a loader attribute.
      Then, before fetching the pack file for a github origin without any base
      snapshot in the archive, check that the pack file size does not exceed the
      threshold defined by the loader. If it does, abort the loading in
      order to save some network bandwidth.
      
      Related to #3652
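
A minimal sketch of the size check described above. The function names and the threshold parameter are hypothetical, and the sketch assumes the size is exposed under a `size` key of the repository metadata JSON, expressed in kibibytes.

```python
import json
from typing import Optional


def pack_size_bytes(raw_metadata: bytes) -> Optional[int]:
    """Extract the pack file size in bytes from raw JSON metadata, if present."""
    try:
        metadata = json.loads(raw_metadata)
    except ValueError:
        # Covers both invalid JSON and undecodable bytes.
        return None
    if not isinstance(metadata, dict):
        return None
    size_kib = metadata.get("size")  # assumed key name, value in KiB
    if not isinstance(size_kib, int):
        return None
    return size_kib * 1024


def should_abort(
    pack_size: Optional[int], size_limit: int, has_base_snapshot: bool
) -> bool:
    """Abort only for first visits whose advertised size exceeds the limit."""
    if has_base_snapshot or pack_size is None:
        return False
    return pack_size > size_limit


size = pack_size_bytes(b'{"name": "example", "size": 2048}')
print(size, should_abort(size, size_limit=1024 * 1024, has_base_snapshot=False))
```

An origin with a base snapshot is never rejected in this sketch, since an incremental fetch is expected to transfer much less than a full clone.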
  18. Apr 14, 2023
  19. Mar 15, 2023
  20. Mar 13, 2023
  21. Mar 08, 2023
  22. Feb 20, 2023
    • loader: Add statsd metric to count number of resolved external refs · bcaf4241
      Antoine Lambert authored
      Add a statsd counter to report the total number of git objects per type
      resolved from the archive but also the total number of objects that could
      not be resolved.
      
      Ensure counters are incremented exactly once per git object as the
      content of the pack file is iterated four times, one per git object
      type.
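
One way to keep such counters exact despite repeated passes over the pack file is to remember which object ids were already counted. The sketch below is an assumption about how this could be done, not the loader's implementation; the class name and the in-memory `Counter` (standing in for statsd) are hypothetical.

```python
from collections import Counter
from typing import Optional, Set


class ExternalRefCounter:
    """Count resolved and unresolved external refs once per object id."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()
        self._seen: Set[bytes] = set()

    def record(self, object_id: bytes, object_type: Optional[str]) -> None:
        """Record a resolution attempt; `object_type` is None when it failed."""
        if object_id in self._seen:
            return  # already counted during a previous pass
        self._seen.add(object_id)
        self.counts[object_type or "unresolved"] += 1


counter = ExternalRefCounter()
# The same references are encountered on each of the four passes.
for _ in range(4):
    counter.record(b"\x01" * 20, "blob")
    counter.record(b"\x02" * 20, None)
print(dict(counter.counts))
```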
    • loader: Check resolved git objects from the archive are not corrupted · 4fe16a3e
      Antoine Lambert authored
      Check that git objects resolved from the archive by the _resolve_ext_ref
      method are not corrupted, so that a loading fails when encountering such
      edge cases.
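
A corruption check of this kind can rely on the way git computes object ids: the SHA-1 of a `<type> <length>\0` header followed by the object data. The sketch below only illustrates that principle; `check_resolved_object` is a hypothetical helper, not the loader's method.

```python
import hashlib


def git_object_id(type_name: bytes, data: bytes) -> str:
    """Compute the hex git object id for the given object type and raw data."""
    header = type_name + b" " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def check_resolved_object(expected_id: str, type_name: bytes, data: bytes) -> None:
    """Raise ValueError if the data does not hash to the expected object id."""
    computed = git_object_id(type_name, data)
    if computed != expected_id:
        raise ValueError(
            f"corrupted object: expected {expected_id}, computed {computed}"
        )


# The empty blob has a well-known id.
check_resolved_object("e69de29bb2d1d6434b8b29ae775ad8c2e48c5391", b"blob", b"")
```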
    • loader: Implement resolve_ext_ref callback of PackInflater · 5d93f40a
      Antoine Lambert authored
      There are cases where a git server can send a pack file containing external
      references to objects stored in the local repository. In that case, data
      for such objects are not included in the pack file and must be resolved
      from the local git object store.
      
      Such a pack file is typically generated when a git client asks the server
      for a thin pack, but it can also be generated when a client is not
      requesting a thin pack.
      
      It has been observed that GitHub can send such pack files to the loader for
      origins already visited. The loader does not ask for thin pack files, so it
      might be an implementation issue of git-upload-pack on their side.
      
      This is problematic as dulwich raises a KeyError exception when it finds
      that a pack file contains unresolved external references, and thus the
      git loading fails.
      
      In order to work around that issue, implement the resolve_ext_ref callback
      that can be passed as a parameter to the PackInflater class. When dulwich
      encounters an external reference in the pack file, it calls that
      callback to resolve the object data. As an external object has already been
      stored in the archive, we can reconstruct its git manifest from the
      archive content and thus resolve the external reference in the pack file.
      
      Fixes #4745
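
The shape of such a callback can be sketched without dulwich. The example below assumes the callback receives an object id and returns a pair of git type number and raw object data; that return shape, the factory name, and the dict used in place of the archive are all assumptions made for illustration. The type numbers are the ones used by the git pack format.

```python
from typing import Callable, Dict, Tuple

# Object type numbers as defined by the git pack format.
GIT_TYPE_NUMS = {"commit": 1, "tree": 2, "blob": 3, "tag": 4}


def make_resolve_ext_ref(
    archive: Dict[bytes, Tuple[str, bytes]],
) -> Callable[[bytes], Tuple[int, bytes]]:
    """Build a callback resolving external refs from a dict-based archive.

    `archive` maps an object id to (type name, reconstructed git manifest);
    it is a hypothetical stand-in for the real archive storage.
    """

    def resolve_ext_ref(object_id: bytes) -> Tuple[int, bytes]:
        if object_id not in archive:
            # Same failure mode as an unresolved reference.
            raise KeyError(f"external reference {object_id.hex()} not in archive")
        type_name, manifest = archive[object_id]
        return GIT_TYPE_NUMS[type_name], manifest

    return resolve_ext_ref


resolve = make_resolve_ext_ref({b"\xaa" * 20: ("blob", b"hello\n")})
print(resolve(b"\xaa" * 20))
```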
  23. Feb 17, 2023
  24. Feb 06, 2023
  25. Feb 02, 2023
  26. Dec 19, 2022
  27. Dec 13, 2022
  28. Nov 04, 2022
    • Implement discovering branch targets from the archive · 92d9ada9
      Nicolas Dandrimont authored
      With the proper implementation of packfile negotiation, remotes can
      return packfiles that do not contain all of our wanted objects. Consider
      the following two histories:
      
      * c1                                      * c1     ← [refs/tags/original]
      ↑                                         ↑ ↖
      * c2 ← [refs/heads/main]                  |   * c3 ← [refs/heads/main]
                                                * c2     ← [refs/heads/broken]
      
      The first visit of the origin would load commits c1 and c2, and write a
      snapshot referencing c2.
      
      During the second visit, the loader would tell the origin that c2 is
      known, and that c1 and c3 are wanted (as new heads). The origin, knowing
      that c1 is a parent of c2, would be allowed by the git protocol to send
      a packfile containing only c3. Under these circumstances, the loader
      cannot tell what object type the snapshot branch
      [refs/tags/original] should point to.
      
      The repository in tests has a similar structure ([refs/heads/master] is
      in the history of [refs/tags/branch2-before-delete]), so refactor the
      incremental load test to exercise this specific behavior. This test can
      be moved to the common tests as well.
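
The fallback described above can be sketched as a lookup that first consults the objects received in the packfile and then the archive. All names below are hypothetical and the two dicts stand in for the real packfile contents and archive queries.

```python
from typing import Dict


def resolve_branch_target_types(
    refs: Dict[bytes, str],
    pack_object_types: Dict[str, str],
    archive_object_types: Dict[str, str],
) -> Dict[bytes, str]:
    """Return the target object type of each ref.

    Objects missing from the packfile are looked up in the archive.
    """
    targets = {}
    for ref_name, object_id in refs.items():
        target_type = pack_object_types.get(object_id)
        if target_type is None:
            # Not sent by the remote: the object must already be archived.
            target_type = archive_object_types.get(object_id)
        if target_type is None:
            raise ValueError(f"unknown target {object_id} for {ref_name!r}")
        targets[ref_name] = target_type
    return targets


# Second visit from the example above: only c3 is in the packfile,
# c1 and c2 were archived during the first visit.
print(
    resolve_branch_target_types(
        refs={b"refs/tags/original": "c1", b"refs/heads/main": "c3"},
        pack_object_types={"c3": "revision"},
        archive_object_types={"c1": "revision", "c2": "revision"},
    )
)
```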
    • dumb loader: also filter the symbolic refs · e7988153
      Nicolas Dandrimont authored
      Even though this is only HEAD, we should make sure that it's filtered anyway.
    • Make utils.filter_refs accept {bytes: bytes} and {bytes: HexBytes} · c2ed09e0
      Nicolas Dandrimont authored
      In terms of mypy, this function is just doing some types-washing anyway.
    • Eagerly populate the set of local heads in RepoRepresentation.__init__ · 3cf7582a
      Nicolas Dandrimont authored
      As dulwich's client.fetch_pack expects a history graph walker instance
      initialized with the set of known heads, move the local heads caching from
      `determine_wants` to the RepoRepresentation initialization logic.
      
      Our previous code would always initialize the graph walker with an empty
      set of heads (as the `graph_walker()` method is called before
      `determine_wants()` has run, so `self.heads` was always empty), so we
      would never actually fetch an incremental pack file.
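
The ordering issue can be illustrated with a simplified stand-in class. This is not the real RepoRepresentation: the constructor argument and the return types are assumptions, and `graph_walker()` simply returns the heads it would advertise.

```python
from typing import Dict, Iterable, List


class RepoRepresentation:
    """Simplified stand-in showing eager population of the local heads."""

    def __init__(self, base_snapshot_heads: Iterable[str]) -> None:
        # Populate the heads here rather than in determine_wants(), so they
        # are available whichever method is called first.
        self.local_heads = set(base_snapshot_heads)

    def graph_walker(self) -> List[str]:
        """Return the heads to advertise as already known to the remote."""
        return sorted(self.local_heads)

    def determine_wants(self, remote_refs: Dict[bytes, str]) -> List[str]:
        """Return the remote targets that are not already known locally."""
        return sorted(
            {oid for oid in remote_refs.values() if oid not in self.local_heads}
        )


repo = RepoRepresentation(base_snapshot_heads=["c2"])
# graph_walker() is called before determine_wants(), as in the commit message.
known = repo.graph_walker()
wants = repo.determine_wants({b"refs/heads/main": "c3", b"refs/heads/broken": "c2"})
print(known, wants)
```

Had the heads been filled in by `determine_wants()`, the first call would have advertised nothing and the remote would have sent a full pack.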