  1. Dec 20, 2022
    • Add a backfiller cli command · e66a2bf9
      David Douard authored and Nicolas Dandrimont committed
      This command allows backfilling a Kafka journal from an existing
      PostgreSQL provenance storage.
      
      The command will run a given number of workers in parallel. The state of
      the backfilling process is saved in a leveldb store, so interrupting and
      restarting a backfilling process is possible, with limitations: it won't
      work properly if the range generation is modified.
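A minimal sketch of the resumable, range-based approach described in this commit message. It is not the actual backfiller code: `make_ranges`, `backfill`, and the plain dict standing in for the LevelDB store are hypothetical names chosen for illustration, and a thread pool is assumed for the parallel workers.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, MutableMapping, Tuple

Range = Tuple[int, int]


def make_ranges(start: int, stop: int, step: int) -> List[Range]:
    """Split [start, stop) into consecutive half-open ranges of at most `step` items."""
    return [(lo, min(lo + step, stop)) for lo in range(start, stop, step)]


def backfill(
    ranges: List[Range],
    process: Callable[[Range], None],
    state: MutableMapping[str, str],  # stand-in for a persistent key-value store
    workers: int = 4,
) -> int:
    """Process every range not yet marked done; return how many were processed."""
    pending = [r for r in ranges if state.get(repr(r)) != "done"]

    def run(r: Range) -> None:
        process(r)
        # Checkpoint only after the range has been fully processed.
        state[repr(r)] = "done"

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(run, pending))  # consume the iterator to surface exceptions
    return len(pending)
```

Because the saved state is keyed by the range boundaries, a restart with the same ranges skips completed work, while a restart with differently generated ranges no longer matches the saved keys. This is one way to see the limitation mentioned above.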
  2. Dec 09, 2022
  3. Nov 29, 2022
  4. Nov 23, 2022
  5. Nov 02, 2022
  6. Oct 18, 2022
  7. Oct 13, 2022
  8. Oct 12, 2022
  9. Oct 11, 2022
    • Add support for kafka journalization of the ProvenanceStorageInterface · 08f2e604
      David Douard authored
      The new ProvenanceStorageJournal is a proxy ProvenanceStorageInterface
      that pushes added objects to a swh-journal (typically Kafka).
      
      Journal messages are simple dicts with 2 keys: id (the sharding key) and
      value (a serializable version of the argument of the xxx_add() method).
      
      Use the 'kafka' pytest marker for all kafka-related tests (especially
      used for tox, see tox.ini).
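A rough sketch of the proxy idea and the two-key message shape described in this commit message. All class and method names here (`RecordingJournal`, `InMemoryStorage`, `JournalingProxy`, `write`, and the `content_add` example method) are hypothetical stand-ins, not the swh-provenance or swh-journal API.

```python
from typing import Any, Dict, List, Tuple


class RecordingJournal:
    """Hypothetical journal writer that just records messages in memory."""

    def __init__(self) -> None:
        self.messages: List[Tuple[str, Dict[str, Any]]] = []

    def write(self, topic: str, message: Dict[str, Any]) -> None:
        self.messages.append((topic, message))


class InMemoryStorage:
    """Hypothetical backend exposing one xxx_add()-style method."""

    def __init__(self) -> None:
        self.contents: Dict[bytes, Any] = {}

    def content_add(self, cnts: Dict[bytes, Any]) -> bool:
        self.contents.update(cnts)
        return True


def journal_messages(entries: Dict[bytes, Any]) -> List[Dict[str, Any]]:
    """Build one message per entry: `id` is the sharding key, `value` the payload."""
    return [{"id": key, "value": value} for key, value in entries.items()]


class JournalingProxy:
    """Forward xxx_add() calls to the wrapped storage after journaling them."""

    def __init__(self, storage: InMemoryStorage, journal: RecordingJournal) -> None:
        self.storage = storage
        self.journal = journal

    def content_add(self, cnts: Dict[bytes, Any]) -> bool:
        for message in journal_messages(cnts):
            self.journal.write("content", message)
        return self.storage.content_add(cnts)
```

Whether the real proxy writes to the journal before or after calling the wrapped storage is not stated in the commit message; the ordering above is an assumption.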
    • Rename ProvenanceInterface.directory_xxx_flattenned as directory_xxx_flattened · 7e6a62c9
      David Douard authored
      and fix all occurrences of the typo.
    • Normalize _add() methods of the ProvenanceStorage interface · 2bd74fc7
      David Douard authored
      Make them all accept a Dict[Sha1Git, xxx] as argument, i.e.:
      
      - remove support for Iterable[bytes] in revision_add, and
      - replace Iterable[bytes] by Dict[Sha1Git, bytes] for location_add
      
      Currently, the sha1 of the location path in location_add() is not really
      used by any backend, so computing these hashes is a waste of
      resources, but it makes the API of this interface much more consistent,
      which will be helpful for upcoming features (like the kafka journal).
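For illustration, a caller that previously passed an Iterable[bytes] of paths to location_add() could build the new Dict[Sha1Git, bytes] argument roughly as follows. This is a sketch under assumptions: `Sha1Git` is treated as a plain `bytes` alias, `locations_by_sha1` is a hypothetical helper, and hashing the raw path bytes with SHA-1 is assumed rather than confirmed by the commit.

```python
import hashlib
from typing import Dict, Iterable

Sha1Git = bytes  # assumed alias: a 20-byte SHA-1 digest


def locations_by_sha1(paths: Iterable[bytes]) -> Dict[Sha1Git, bytes]:
    """Key each location path by the SHA-1 digest of its raw bytes."""
    return {hashlib.sha1(path).digest(): path for path in paths}
```

Duplicate paths collapse into a single entry, since identical paths produce identical keys.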
  10. Oct 03, 2022
  11. Sep 08, 2022
  12. Sep 01, 2022
    • swhgraph: use grpc API · fa15961c
      David Douard authored
      Replace the (deprecated) HTTP RPC API used to access the swh-graph service
      with the gRPC server.
      
      To be able to test the (now) gRPC-based ArchiveGraph, compressed graph
      datasets for all 3 common datasets (cmdbts2, out-of-order and with-merges)
      have been generated and included in this revision.
    • tests: add tags (Release) in the datasets · 865449ec
      David Douard authored
      This will be needed for testing the gRPC swh-graph archive backend.
  13. Aug 30, 2022
  14. Aug 12, 2022