
Draft: Rewrite the Provenance service as a gRPC server in Rust, backed by Parquet (featuring Datafusion joins)

Closed vlorentz requested to merge riir-batch into main

This is !182 (merged) plus one commit that is supposed to add initial support for batch queries.

Unfortunately, Datafusion does not seem to use pushdown when joining, which makes low-cardinality joins very inefficient. Here, joining c_in_r with a one-row table takes 1.5TB and runs for minutes (hours?) on the prod table, while a 'WHERE cnt=' clause was under 0.2s.
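To illustrate why the two forms can differ so much, here is a minimal toy model, not Datafusion's actual API or execution plan. It assumes the big table is split into row groups carrying min/max statistics for the key column (as Parquet row groups do), and that `cnt` is the join key. `RowGroup`, `demo_table`, `join_without_pushdown` and `join_with_key_filter` are hypothetical names made up for this sketch.

```rust
use std::collections::HashSet;

/// Toy stand-in for a Parquet row group: the values of one column plus
/// the min/max statistics kept in the row-group metadata.
struct RowGroup {
    min: u64,
    max: u64,
    cnt: Vec<u64>,
}

impl RowGroup {
    fn new(cnt: Vec<u64>) -> Self {
        let min = *cnt.iter().min().expect("non-empty row group");
        let max = *cnt.iter().max().expect("non-empty row group");
        RowGroup { min, max, cnt }
    }
}

/// Join with no pushdown: every row group of the big table is read and
/// probed against the small side.
/// Returns (matching values, number of row groups read).
fn join_without_pushdown(big: &[RowGroup], small: &[u64]) -> (Vec<u64>, usize) {
    let build: HashSet<u64> = small.iter().copied().collect();
    let mut out = Vec::new();
    let mut read = 0;
    for rg in big {
        read += 1;
        out.extend(rg.cnt.iter().copied().filter(|v| build.contains(v)));
    }
    (out, read)
}

/// Same result, but the small side's keys are first checked against the
/// statistics, so row groups that cannot contain any key are skipped.
/// This is what a `WHERE cnt = ...` filter gets from statistics pruning.
fn join_with_key_filter(big: &[RowGroup], small: &[u64]) -> (Vec<u64>, usize) {
    let build: HashSet<u64> = small.iter().copied().collect();
    let mut out = Vec::new();
    let mut read = 0;
    for rg in big {
        let may_match = small.iter().any(|k| rg.min <= *k && *k <= rg.max);
        if !may_match {
            continue;
        }
        read += 1;
        out.extend(rg.cnt.iter().copied().filter(|v| build.contains(v)));
    }
    (out, read)
}

/// 100 row groups of 1000 sorted, consecutive keys each (made-up data).
fn demo_table() -> Vec<RowGroup> {
    (0..100u64)
        .map(|g| RowGroup::new((g * 1000..(g + 1) * 1000).collect()))
        .collect()
}

fn main() {
    let big = demo_table();
    let small = [42_042u64]; // the "one-row table"
    let (a, read_a) = join_without_pushdown(&big, &small);
    let (b, read_b) = join_with_key_filter(&big, &small);
    assert_eq!(a, b);
    println!("no pushdown: {} row groups read; key filter: {}", read_a, read_b);
}
```

In this toy setup both functions return the same rows, but the second reads only the row groups whose statistics can contain a key. Whether a comparable rewrite (turning the small side into a filter on the big side) is practical inside Datafusion is not something this sketch establishes.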

Merge request reports

Pipeline #11255 failed

Pipeline failed for 859a7bf5 on riir-batch


Closed by vlorentz 3 months ago (Nov 4, 2024 3:33pm UTC)

Merge details

  • The changes were not merged into main.
