v0.2.0 / 2021-04-17

  * athena: pass database name as an attribute
  * docs: Update for new schema
  * Add two ORC tools (orc-merge, orc-print-contents)
  * journalprocessor: only reassign partitions when needed
  * journalprocessor: disable in-partition sharding for LevelDB tests
  * ORC: export missing revision_history table
  * athena: add documentation and licensing info
  * Add athena subcommand to create/query AWS Athena database
  * Move ORC table schema into relational.py
  * test_edges: fix mypy error while mocking a method
  * Fix duplicate reference target
  * Swap README.rst and docs/README.rst to match the new template.
  * Include README.rst in the documentation.
  * Add LevelDB backend for exporter node sets
  * ORC exporter: handle releases with empty authors/dates
  * Update exporters.edges to swh.model 1.0
  * ORC exporter: avoid fromtimestamp(), use datetime() from epoch instead
  * Refactor export paths in the base Exporter class
  * ORC exporter: Add unit tests
  * Add ORC exporter
  * Edge exporter: use common remove_pull_requests() function
  * journalprocessor: be resilient to exporter errors
  * Export CLI: add a way to exclude specific object types
  * Namespace exporters in exporters/ dir
  * journalprocessor: don't shadow the object function
  * journalprocessor: fix hashing of origin_visit_status objects
  * journalprocessor: remove comment about deserialize_message overload being a 'hack'
  * tests: fix test_export_origin
  * SQLite on-disk set: disable journalling and synchronous mode
  * journalprocessor: also partition sqlite files by first byte
  * Journal processor: fetch offsets in parallel
  * Exporter documentation fixes
  * Rewrite of the export pipeline using Exporters
  * Graph export: add labels to the export CSV format
  * graph exporter: schema upgrade for origin_visit_status
  * Replace vcversioner with setuptools-scm
  * Run isort after the CLI import changes
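
The ORC exporter entry above about avoiding fromtimestamp() can be illustrated with a minimal sketch. It is not the exporter's actual code: the helper name `datetime_from_epoch` and its parameters are hypothetical, and the motivation given in the comments (platform-dependent range limits of `fromtimestamp()`) is an assumption, since the changelog does not state one.

```python
from datetime import datetime, timedelta, timezone

# Reference point: the Unix epoch as a timezone-aware datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def datetime_from_epoch(seconds: int, microseconds: int = 0) -> datetime:
    """Hypothetical helper: build a UTC datetime by offsetting from the epoch.

    datetime.fromtimestamp() relies on the platform's C time functions and
    may raise OverflowError or OSError for timestamps outside the range they
    support (assumed here to be the reason for the change). Plain datetime
    arithmetic is limited only by datetime.min and datetime.max.
    """
    return EPOCH + timedelta(seconds=seconds, microseconds=microseconds)


if __name__ == "__main__":
    # A negative timestamp (before 1970) is handled by ordinary arithmetic.
    print(datetime_from_epoch(-1).isoformat())
    # The release date of this version, at midnight UTC.
    print(datetime_from_epoch(1618617600).isoformat())
```

Values that fall outside the range datetime can represent still raise OverflowError, so a caller would need to decide how to handle such dates.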