  1. Feb 19, 2021
    • QualifiedSWHID: Replace the 'qualifiers' dict with statically defined attributes · 8e917597
      vlorentz authored
      And store their parsed values (CoreSWHID, tuple of ints, etc.) instead of strings.
      8e917597
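As a minimal sketch of what storing a parsed value instead of a string can look like, the helper below turns a `lines` qualifier into a tuple of ints. The function name `parse_lines_qualifier` and the `"9"` / `"9-15"` input shapes are assumptions made for illustration; this is not the code from the commit.

```python
from typing import Optional, Tuple


def parse_lines_qualifier(value: str) -> Tuple[int, Optional[int]]:
    """Parse "9" or "9-15" into (start, end); end is None for a single line.

    Hypothetical helper, not part of swh.model.
    """
    start, sep, end = value.partition("-")
    # int() raises ValueError on malformed input, which doubles as validation.
    return (int(start), int(end) if sep else None)


print(parse_lines_qualifier("9-15"))
print(parse_lines_qualifier("9"))
```

Keeping the parsed tuple on the object means callers never have to re-split the string themselves.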
    • Add new class CoreSWHID as an alternative to SWHID/QualifiedSWHID · eba8d84d
      vlorentz authored
      Following the discussion on T3034, we decided to replace SWHID with
      two or three classes:
      
      * QualifiedSWHID to replace the existing SWHID (standard types + qualifiers)
      * CoreSWHID, for "core SWHID" only (standard types + no qualifiers)
      * ExtendedSWHID for internal use in Software Heritage (extra types + no qualifiers)
      
      This commit adds the second one.
      eba8d84d
    • Add new class QualifiedSWHID to replace SWHID, and deprecate the latter. · 690b7f82
      vlorentz authored
      Following the discussion on T3034, we decided to replace SWHID with
      two or three classes:
      
      * QualifiedSWHID to replace the existing SWHID (standard types + qualifiers)
      * CoreSWHID, for "core SWHID" only (standard types + no qualifiers)
      * ExtendedSWHID for internal use in Software Heritage (extra types + no qualifiers)
      
      Since migrating from SWHID will break existing code, this commit takes
      the opportunity to modernize it a little, i.e.:
      
      * `keyword`-only constructor, to get rid of the hacky default values for
        `object_type` and `object_id`
      * enum instead of strings for the object type
      * `bytes` instead of a hex string for the object id
      * rename `metadata` to `qualifiers`
      690b7f82
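The modernization points above can be sketched with a small stand-in class. `SketchSWHID` and `ObjectType` are hypothetical names, and the 20-byte length check is an assumption; the real class in swh.model may be structured differently.

```python
import enum


class ObjectType(enum.Enum):
    """Hypothetical enum replacing plain strings for the object type."""

    CONTENT = "cnt"
    DIRECTORY = "dir"
    REVISION = "rev"
    RELEASE = "rel"
    SNAPSHOT = "snp"


class SketchSWHID:
    """Illustrative stand-in, not the swh.model implementation."""

    # The bare "*" makes every argument keyword-only, so no default
    # values are needed for object_type and object_id.
    def __init__(self, *, object_type: ObjectType, object_id: bytes):
        if not isinstance(object_type, ObjectType):
            raise TypeError("object_type must be an ObjectType")
        if not isinstance(object_id, bytes) or len(object_id) != 20:
            raise ValueError("object_id must be 20 raw bytes")
        self.object_type = object_type
        self.object_id = object_id

    def __str__(self) -> str:
        # The hex form is only produced when rendering the identifier.
        return f"swh:1:{self.object_type.value}:{self.object_id.hex()}"


swhid = SketchSWHID(object_type=ObjectType.CONTENT, object_id=bytes(20))
print(swhid)
```

Calling `SketchSWHID(ObjectType.CONTENT, bytes(20))` positionally raises `TypeError`, which is the point of the keyword-only signature.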
    • tests: Clean hashutil._blake2_hash_cache after mocking blake2 functions. · 758eb885
      vlorentz authored
      Depending on the order in which tests are run, these tests may insert
      lambdas with mocked blake2 functions in their closure into
      hashutil._blake2_hash_cache, causing all subsequent tests to fail.
      
      While this does not happen with the default order of tests, it does when
      using pytest-xdist.
      758eb885
  2. Feb 02, 2021
  3. Jan 29, 2021
  4. Jan 26, 2021
  5. Jan 20, 2021
  6. Jan 13, 2021
  7. Jan 12, 2021
    • test_identifiers: Reorder SWHID tests. · 1d0c3212
      vlorentz authored
      They were mixed in with snapshot tests.
      1d0c3212
    • test_identifiers: Make sure that {directory,revision,release,snapshot}_identifier() doesn't just return a value from the dict. · 731d10d3
      vlorentz authored
      
      For example, before this commit, you could replace the code of
      release_identifier() with this:
      
      def release_identifier(release):
          return release.get("id", b"")
      
      and all tests would still pass.
      731d10d3
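One way to guard against that shortcut is to feed the function a deliberately wrong stored id and check that the result is still recomputed. The sketch below uses a hypothetical `sketch_identifier` that hashes a `data` field; it is not one of the real `*_identifier()` functions.

```python
import hashlib


def sketch_identifier(obj: dict) -> bytes:
    """Hypothetical identifier: derived from the data, never from obj["id"]."""
    return hashlib.sha1(obj["data"]).digest()


def cheating_identifier(obj: dict) -> bytes:
    """The shortcut the commit message warns about."""
    return obj.get("id", b"")


def check_identifier_ignores_stored_id(identifier) -> bool:
    """Return True if `identifier` recomputes the id rather than reading it."""
    expected = hashlib.sha1(b"hello").digest()
    with_wrong_id = {"data": b"hello", "id": b"\x00" * 20}
    without_id = {"data": b"hello"}
    return (
        identifier(with_wrong_id) == expected
        and identifier(without_id) == expected
    )


print(check_identifier_ignores_stored_id(sketch_identifier))
print(check_identifier_ignores_stored_id(cheating_identifier))
```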
  8. Jan 04, 2021
  9. Dec 30, 2020
    • SWHID parsing: simplify and deduplicate validation logic · 57468505
      Stefano Zacchiroli authored
      Before this change there was a lot of overlap between parse_swhid() and the
      attrs-based validators in the SWHID class. Also, the validation implementation
      in parse_swhid() was done by hand.
      
      With this change the coarse-grained validation done by parse_swhid() is now
      delegated to a regex. The semantic validation of SWHIDs is left to attrs
      validators. The regex is also exposed as a module attribute, to be used by
      client code that wants to syntactically validate SWHIDs without necessarily
      instantiating SWHID classes (we have several other modules doing that already,
      and they are using slightly different hand-made regexes, which isn't great).
      
      As part of this change we also clean up the use of ValidationError exceptions,
      systematically passing the problematic parts of the SWHID as arguments, and
      make error messages uniform.
      
      This change also brings some speed up in SWHID parsing. On a benchmark parsing
      ~30 M valid SWHIDs, the previous implementation took ~3:06 minutes, the new one
      ~2:50 minutes, or a ~9% speedup.
      
      Closes T2788
      57468505
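A minimal sketch of regex-based coarse validation, assuming the usual `swh:1:<type>:<40 hex digits>` shape followed by optional `;key=value` qualifiers. The pattern and the names `SKETCH_SWHID_RE` and `is_syntactically_valid` are illustrative assumptions, not the module attribute introduced by the commit.

```python
import re

# Hypothetical pattern: syntax only, no semantic checks.
SKETCH_SWHID_RE = re.compile(
    r"swh:(?P<version>1)"
    r":(?P<type>cnt|dir|rel|rev|snp)"
    r":(?P<id>[0-9a-f]{40})"
    r"(?P<qualifiers>(?:;[a-z]+=[^;]*)*)"
)


def is_syntactically_valid(swhid: str) -> bool:
    """Check the overall shape of a SWHID without building any object."""
    return SKETCH_SWHID_RE.fullmatch(swhid) is not None


print(is_syntactically_valid("swh:1:cnt:" + "a" * 40))
print(is_syntactically_valid("swh:1:cnt:" + "a" * 39))
```

Semantic checks, such as whether a qualifier is allowed for a given object type, would still belong in validators on the class itself.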
  10. Dec 15, 2020
    • model: Make all classes slotted. · 76b744e0
      vlorentz authored
      Unfortunately, sphinx (actually, autodoc) only picks up attributes if
      they fall in any of these cases:
      
      1. are enum variants
      2. are in slots
      3. are in __dict__
      4. have an annotation
      5. are found using its custom parser
      
      (see get_object_members in sphinx/ext/autodoc/importer.py)
      
      In theory, option 5 should work for us; unfortunately, autodoc only
      asks the parser for the list of members with a comment.
      And it's not easy to adapt it to ask the parser for all members,
      because said parser (sphinx/pycode/parser.py) does not return the class
      qualname (aka. namespace) for members without comments.
      
      So, as I don't want to change the interface of sphinx.pycode.parser,
      this commit switches to relying on option 2, by adding __slots__ for
      all attr classes.
      
      Additionally, this might have some performance/memory improvement
      (though I did not check) and will further avoid mutation of these
      objects.
      76b744e0
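The effect of slots can be shown with a plain class; the commit applies the same idea to attrs classes. `SketchPerson` is a hypothetical example, not a class from swh.model.

```python
class SketchPerson:
    """Hypothetical slotted class: attributes are declared up front."""

    __slots__ = ("name", "email")

    def __init__(self, name: str, email: str):
        self.name = name
        self.email = email


person = SketchPerson("Jane", "jane@example.org")

# Slotted instances have no per-instance __dict__, which saves memory.
print(hasattr(person, "__dict__"))

try:
    person.nickname = "jd"  # not declared in __slots__
except AttributeError as exc:
    print("rejected:", exc)
```

Note that `__slots__` only blocks adding undeclared attributes; existing ones can still be reassigned unless the class is also frozen.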
  11. Nov 16, 2020
  12. Nov 12, 2020
  13. Nov 10, 2020
  14. Oct 27, 2020
  15. Oct 26, 2020
  16. Oct 23, 2020
  17. Oct 14, 2020
    • Make revision/release identifiers explicitly the hash of a manifest · 9224c8ca
      Nicolas Dandrimont authored
      This collapses the shared logic between these two identifier computations into a
      few more explicit steps:
       - generate data for the manifest (in either identifier computation);
       - format the manifest (in the new format_manifest function);
       - hash the manifest (in the new hash_manifest function).
      
      This will enable reusing this logic for more object types, as well as stronger
      typing for the manifest computation.
      9224c8ca
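The three steps can be sketched as follows. The function names `format_manifest` and `hash_manifest` come from the commit message, but the signatures, the header layout, and the git-style `"<type> <length>\0"` prefix are assumptions made for this illustration.

```python
import hashlib
from typing import Iterable, Tuple


def format_manifest(
    headers: Iterable[Tuple[bytes, bytes]], message: bytes = b""
) -> bytes:
    """Serialize (key, value) headers, then an optional message (assumed layout)."""
    body = b"".join(key + b" " + value + b"\n" for key, value in headers)
    if message:
        body += b"\n" + message
    return body


def hash_manifest(object_type: str, manifest: bytes) -> bytes:
    """Hash the manifest with a git-style object header (assumption)."""
    header = f"{object_type} {len(manifest)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + manifest).digest()


# Step 1: generate the data; step 2: format it; step 3: hash it.
headers = [(b"tree", b"0" * 40), (b"author", b"Jane <jane@example.org>")]
manifest = format_manifest(headers, b"Initial commit\n")
print(hash_manifest("commit", manifest).hex())
```

Separating formatting from hashing lets other object types reuse `hash_manifest` as long as they can produce a manifest.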
  18. Oct 08, 2020
    • Add a 'unique_key' method on model objects · a251df2e
      vlorentz authored
      that returns a value suitable for uniqueness constraints.
      
      Motivation:
      
      * this is somewhat more of a model concern than a journal/kafka
        concern IMO
      * this is one step toward adding support for non-model objects in
        KafkaJournalWriter
      
      Implementation of the unique_key methods comes from
      `swh.journal.serializers.object_key`.
      v0.7.1
      a251df2e
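A rough sketch of the idea, assuming hashed objects are keyed by their id and objects without an intrinsic id are keyed by their identifying fields. Both classes and their key shapes are hypothetical; the actual return values in swh.model may differ.

```python
class SketchRevision:
    """Hypothetical hashed object: its id is already unique."""

    def __init__(self, id: bytes):
        self.id = id

    def unique_key(self) -> bytes:
        return self.id


class SketchOrigin:
    """Hypothetical object with no intrinsic id: keyed by its URL."""

    def __init__(self, url: str):
        self.url = url

    def unique_key(self) -> dict:
        return {"url": self.url}


# A writer can deduplicate or key messages without knowing each type.
for obj in (SketchRevision(b"\x01" * 20), SketchOrigin("https://example.org/repo")):
    print(type(obj).__name__, obj.unique_key())
```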
  19. Oct 05, 2020
  20. Oct 02, 2020
  21. Sep 29, 2020
  22. Sep 25, 2020
  23. Sep 18, 2020
  24. Sep 17, 2020
  25. Sep 10, 2020
  26. Aug 25, 2020
  27. Aug 14, 2020
    • model: Raise error on naive datetimes. · 6dd6acec
      vlorentz authored
      We may unknowingly pass naive datetimes to the storage through model
      objects, causing the underlying DB to assign them a timezone that might
      not match the actual one.
      
      It already happens in swh.model and swh.loader.package tests.
      6dd6acec
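A check of this kind can be sketched with the standard library alone. `check_aware` is a hypothetical helper name, and `ValueError` is an assumed exception type; the commit may raise something else.

```python
import datetime


def check_aware(value: datetime.datetime) -> datetime.datetime:
    """Reject naive datetimes instead of letting the DB guess a timezone."""
    # A datetime is naive when tzinfo is missing or yields no UTC offset.
    if value.tzinfo is None or value.utcoffset() is None:
        raise ValueError("naive datetime rejected: attach an explicit timezone")
    return value


aware = datetime.datetime(2020, 8, 14, 12, 0, tzinfo=datetime.timezone.utc)
print(check_aware(aware).isoformat())

try:
    check_aware(datetime.datetime(2020, 8, 14, 12, 0))
except ValueError as exc:
    print("rejected:", exc)
```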
  28. Aug 07, 2020