Most indexers are consuming journal topics slower than messages are produced
Grafana dashboards show that most indexer consumers are working, but they are lagging behind, and the gap is increasing.
- https://grafana.softwareheritage.org/goto/RJkpEXVVz?orgId=1 origin-intrinsic-metadata (on
- https://grafana.softwareheritage.org/goto/c_iYPX4Vz?orgId=1 content-fossology-license (on
- https://grafana.softwareheritage.org/goto/g1VPPX4Vz?orgId=1 content-mimetype (on
- however, the extrinsic-metadata indexer is fine: https://grafana.softwareheritage.org/goto/3YQsPX44z?orgId=1
The plot of lag derivative shows a progressive slowdown, so it's probably not due to a specific configuration change.
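
To make the "lag derivative" reading concrete, here is a minimal sketch of how it could be computed from sampled lag values. The sample data and the `lag_derivative` helper are hypothetical, not taken from the dashboards. A rate that keeps climbing from one interval to the next points to a progressive slowdown, while a single jump followed by a flat rate would look more like a one-off change.

```python
from typing import List, Tuple


def lag_derivative(samples: List[Tuple[float, int]]) -> List[float]:
    """Rate of change of consumer lag (messages/second) between consecutive samples.

    Each sample is (timestamp in seconds, total lag in messages).
    A positive rate means messages are produced faster than they are consumed.
    """
    rates = []
    for (t0, lag0), (t1, lag1) in zip(samples, samples[1:]):
        rates.append((lag1 - lag0) / (t1 - t0))
    return rates


# Hypothetical samples taken one minute apart (not real dashboard data).
samples = [(0, 1000), (60, 1600), (120, 2800), (180, 4600)]
rates = lag_derivative(samples)
print(rates)  # [10.0, 20.0, 30.0]: the gap grows faster in every interval
```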
I do not know what is causing this, though. Three possible suspects:
- rdkafka frequently disconnecting from the brokers (or, more generally, having connection issues): https://sentry.softwareheritage.org/share/issue/76ed328b2ae6465face2ea4bb5f32187/
- slow storage and/or objstorage (which would make sense, as the extrinsic-metadata indexer is super-fast, and is also the only one not to use the storage and objstorage)
- simply not having enough workers
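
For the last suspect, a rough back-of-the-envelope check could look like the sketch below. All numbers and the `workers_needed` helper are hypothetical; the rates would have to be read from the dashboards. It assumes every worker processes messages at about the same rate, and it relies on the general Kafka behaviour that, within one consumer group, consumers beyond the number of topic partitions stay idle.

```python
import math
from typing import Tuple


def workers_needed(
    production_rate: float, per_worker_rate: float, partitions: int
) -> Tuple[int, bool]:
    """Estimate how many workers are needed to keep up with production.

    Returns (workers, feasible). `feasible` is False when more workers than
    partitions would be required, since the extra consumers in the group
    would not be assigned any partition.
    """
    if per_worker_rate <= 0:
        raise ValueError("per_worker_rate must be positive")
    workers = math.ceil(production_rate / per_worker_rate)
    return workers, workers <= partitions


# Hypothetical figures: 500 msg/s produced, 40 msg/s per worker, 64 partitions.
print(workers_needed(500, 40, 64))  # (13, True)
```

If the estimate is close to the current worker count, the slowdown is more likely to come from one of the other two suspects than from capacity alone.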
Migrated from T4612 (view on Phabricator)