  1. Mar 18, 2021
  2. Mar 12, 2021
  3. Mar 08, 2021
  4. Mar 04, 2021
  5. Mar 01, 2021
  6. Dec 30, 2020
    • SWHID parsing: simplify and deduplicate validation logic · 57468505
      Stefano Zacchiroli authored
      Before this change there was a lot of overlap between parse_swhid() and the
      attrs-based validators in the SWHID class. Also, the validation implementation
      in parse_swhid() was done by hand.
      
      With this change the coarse-grained validation done by parse_swhid() is now
      delegated to a regex. The semantic validation of SWHIDs is left to attrs
      validators. The regex is also exposed as a module attribute, to be used by
      client code that wants to syntactically validate SWHIDs without necessarily
      instantiating SWHID classes (we have several other modules doing that already,
      and they are using slightly different hand-made regexes, which isn't great).
      
      As part of this change we also clean up the use of ValidationError exceptions,
      systematically passing the problematic parts of the SWHID as arguments, and
      make error messages uniform.
      
      This change also brings some speed up in SWHID parsing. On a benchmark parsing
      ~30 M valid SWHIDs, the previous implementation took ~3:06 minutes, the new one
      ~2:50 minutes, or a ~9% speedup.
      
      Closes T2788
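
A minimal sketch of the regex-based syntactic check described in this commit message. The pattern and the names `SWHID_SKETCH_RE` and `is_syntactically_valid` are assumptions made for illustration; the module attribute actually exposed by the commit may be named and written differently. The sketch only covers the coarse syntax (scheme, version 1, object type, 40 hex digits, optional `;key=value` qualifiers) and leaves semantic validation to other code, as the message describes.

```python
import re

# Hypothetical pattern, not the one shipped in swh.model.
# Core identifier: swh:1:<type>:<40 lowercase hex digits>,
# optionally followed by ";key=value" qualifiers.
SWHID_SKETCH_RE = re.compile(
    r"swh:(?P<version>1):(?P<type>snp|rel|rev|dir|cnt):(?P<hash>[0-9a-f]{40})"
    r"(?P<qualifiers>(?:;[a-z]+=[^;]*)*)"
)


def is_syntactically_valid(swhid: str) -> bool:
    """Return True if the whole string matches the coarse SWHID syntax."""
    return SWHID_SKETCH_RE.fullmatch(swhid) is not None
```

Client code that only needs a syntax check could call `is_syntactically_valid()` without building any object, which is the use case the message gives for exposing the regex.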
      57468505
  7. Nov 16, 2020
  8. Oct 26, 2020
  9. Oct 08, 2020
    • Add a 'unique_key' method on model objects · a251df2e
      vlorentz authored
      that returns a value suitable for unicity constraints.
      
      Motivation:
      
      * this is somewhat more of a model concern than a journal/kafka
        concern IMO
      * this is one step toward adding support for non-model objects in
        KafkaJournalWriter
      
      Implementation of the unique_key methods comes from
      `swh.journal.serializers.object_key`.
      v0.7.1
      a251df2e
  10. Sep 17, 2020
  11. Aug 14, 2020
    • model: Raise error on naive datetimes. · 6dd6acec
      vlorentz authored
      We may unknowingly pass naive datetimes to the storage through model objects,
      causing the underlying DB to assign them a timezone that might not match
      the actual one.
      
      It already happens in swh.model and swh.loader.package tests.
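
A minimal sketch of how a naive datetime can be rejected before it reaches a database. The function name `check_aware` and the choice of `ValueError` are assumptions for illustration, not the validator added by this commit.

```python
import datetime


def check_aware(value: datetime.datetime) -> datetime.datetime:
    """Return value unchanged if it is timezone-aware, else raise.

    A datetime is naive when it has no tzinfo, or when its tzinfo
    cannot provide a UTC offset.
    """
    if value.tzinfo is None or value.utcoffset() is None:
        raise ValueError("naive datetime: an explicit timezone is required")
    return value
```

For example, `check_aware(datetime.datetime.now(tz=datetime.timezone.utc))` passes, while `check_aware(datetime.datetime.now())` raises.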
      6dd6acec
  12. Jul 29, 2020
  13. Jul 07, 2020
  14. Jul 06, 2020
    • Extract the extra_headers from metadata on the Revision model class · a7d9aca2
      David Douard authored
      Add a new extra_headers attribute on Revision and use it for computing
      the revision's id instead of extracting it from the metadata field.
      
      Only accept (bytes, bytes) as extra_header.
      
      Add a post init hook to Revision to initialize this new attribute from
      given metadata, if any, for backward compatibility.
      
      Also amend the revision_d hypothesis strategy to generate extra_headers.
      v0.4.0
      a7d9aca2
  15. Jun 24, 2020
  16. May 20, 2020
    • Add support for model object anonymization · 29312dff
      David Douard authored
      Simply add a BaseModel.anonymize() method. Default implementation returns
      None, meaning the object is not anonymizable.
      
      For Person, the method returns a Person with hashed fullname (and unset
      name and email).
      
      For Revision and Release, the method returns an anonymized version of
      the object, i.e. with instances of Person replaced by anonymized ones.
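
A minimal sketch of the anonymization behaviour described above, using plain dataclasses. The class layout and the use of SHA-256 as the hash are assumptions for illustration; the commit message only states that the fullname is hashed and that name and email are unset.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Person:
    """Simplified stand-in for a model Person (hypothetical layout)."""

    fullname: bytes
    name: Optional[bytes] = None
    email: Optional[bytes] = None

    def anonymize(self) -> "Person":
        # Replace fullname with its digest; name and email fall back
        # to their None defaults, i.e. they are unset.
        return Person(fullname=hashlib.sha256(self.fullname).digest())
```

An object holding Person instances, such as a revision with an author and a committer, would anonymize itself by building a copy in which each Person is replaced by the result of its `anonymize()` call.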
      v0.2.0
      29312dff
  17. Apr 10, 2020
  18. Apr 08, 2020
    • Enable black · bf3f1cec
      David Douard authored
      - blackify all the python files,
      - enable black in pre-commit,
      - add a black tox environment.
      bf3f1cec
  19. Apr 01, 2020
  20. Mar 31, 2020
  21. Mar 11, 2020
  22. Mar 04, 2020
  23. Mar 02, 2020
  24. Feb 27, 2020
  25. Feb 24, 2020
    • Add to_model() method to from_disk.{Content,Directory}, to convert to canonical model objects. · 6da524cb
      vlorentz authored
      They will be used by loaders, so they can deal only with
      model objects, instead of having to do the same conversion themselves.
      
      This removes the `data` and `save_path` arguments of `from_file` and
      `from_disk`, as data loading is always deferred from now on.
      To access it, users are now expected to either open the data files
      themselves, or use `.to_model().with_data()`.
      6da524cb
  26. Feb 14, 2020
  27. Jan 30, 2020
  28. Nov 29, 2019
  29. Oct 30, 2019
  30. Oct 29, 2019