Graph and content data export for SWH

Why this MR

Enables the deduplicator to prepare the data (all graph nodes, including content nodes with their source code data) for export in ORC format, based on the following specification

How to test

Start the ingestion pipeline with winery, update the configuration file, and launch the start_deduplicator.py script.

Closes #52 and #63.

Edited by Simeon Carstens
