Draft: Rewrite the Provenance service as a gRPC server in Rust, backed by Parquet (featuring Datafusion joins)

vlorentz requested to merge riir-batch into main

This is !182 (merged) plus one commit that is supposed to add initial support for batch queries.

Unfortunately, Datafusion does not seem to use pushdown when joining, which makes low-cardinality joins very inefficient. Here, joining c_in_r with a one-row table takes 1.5TB and runs for minutes (hours?) on the prod table, while the equivalent 'WHERE cnt=' clause took under 0.2s.
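
To illustrate why the missing pushdown matters, here is a minimal stdlib-only sketch. It does not use Datafusion or Parquet at all: `RowGroup`, `build_table`, `join_full_scan` and `join_with_pushdown` are hypothetical names, and the table layout (sorted `cnt` values with per-row-group min/max statistics) is an assumption made for the example. It only shows the general idea that a join which ignores the small side's keys has to read every row group, while a filter on the same keys can skip almost all of them.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a Parquet row group:
/// some rows plus min/max statistics on the `cnt` column.
struct RowGroup {
    min_cnt: u64,
    max_cnt: u64,
    rows: Vec<(u64, u64)>, // (cnt, payload); the payload column is made up
}

/// Builds a toy table whose `cnt` values are sorted, so that the
/// min/max statistics of different row groups do not overlap.
fn build_table(num_groups: u64, rows_per_group: u64) -> Vec<RowGroup> {
    (0..num_groups)
        .map(|g| {
            let start = g * rows_per_group;
            let end = start + rows_per_group;
            RowGroup {
                min_cnt: start,
                max_cnt: end - 1,
                rows: (start..end).map(|cnt| (cnt, cnt * 10)).collect(),
            }
        })
        .collect()
}

/// Join without pushdown: every row group is read and probed against the keys.
/// Returns the matching rows and the number of row groups that were read.
fn join_full_scan(table: &[RowGroup], keys: &HashSet<u64>) -> (Vec<(u64, u64)>, usize) {
    let mut out = Vec::new();
    let mut groups_read = 0;
    for group in table {
        groups_read += 1;
        out.extend(group.rows.iter().copied().filter(|(cnt, _)| keys.contains(cnt)));
    }
    (out, groups_read)
}

/// Join with the keys pushed down as a filter: row groups whose
/// [min_cnt, max_cnt] range contains none of the keys are skipped unread.
fn join_with_pushdown(table: &[RowGroup], keys: &HashSet<u64>) -> (Vec<(u64, u64)>, usize) {
    let mut out = Vec::new();
    let mut groups_read = 0;
    for group in table {
        let may_match = keys
            .iter()
            .any(|k| (group.min_cnt..=group.max_cnt).contains(k));
        if !may_match {
            continue;
        }
        groups_read += 1;
        out.extend(group.rows.iter().copied().filter(|(cnt, _)| keys.contains(cnt)));
    }
    (out, groups_read)
}

fn main() {
    let table = build_table(1000, 100);
    // The "one-row table" side of the join: a single key.
    let keys: HashSet<u64> = HashSet::from([4242]);

    let (full, full_read) = join_full_scan(&table, &keys);
    let (pushed, pushed_read) = join_with_pushdown(&table, &keys);

    assert_eq!(full, pushed);
    println!("full scan read {} row groups", full_read);
    println!("pushdown read {} row groups", pushed_read);
}
```

In this toy setup both strategies return the same single row, but the full scan reads all 1000 row groups while the pushdown variant reads 1. The real gap on the prod table depends on how the data is laid out, so this is only meant to show the shape of the problem.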