Publish scrubber metrics and create grafana dashboard
number of checked_ranges (and ratio that was checked).
oldest checked_range
number of missing_object
Migrated from T4684 (view on Phabricator)
Activity
- vlorentz added Datastore Scrubber priority:High labels
- vlorentz changed title from Create grafana dashboard for scrubber metrics to Create scrubber metrics and grafana dashboard
- vlorentz changed title from Create scrubber metrics and grafana dashboard to Publish scrubber metrics and create grafana dashboard
- vlorentz changed milestone to %Regularly scrub journal, storage, and objstorage [Roadmap - Preserve]
- vlorentz added activity::Epic label
- vlorentz mentioned in issue #4685 (closed)
untested config:
- name: scrubber_checked
  scope: database
  database: ^(swh|softwareheritage)-scrubber$
  interval: '1h'
  help: "Software Heritage Scheduler scrubber coverage"
  query: |
    WITH bucket AS (
      SELECT distinct datastore, object_type, nb_partitions
      FROM checked_partition
    )
    SELECT
      datastore.package AS datastore_package,
      datastore.class AS datastore_class,
      datastore.instance AS datastore_instance,
      bucket.object_type AS object_type,
      bucket.nb_partitions AS nb_partitions,
      COUNT(*) AS checked_partitions_total,
      MIN(last_date) AS oldest_check_date,
      NOW() - MIN(last_date) AS oldest_check_age_seconds
    FROM checked_partition
    INNER JOIN datastore ON (datastore.id=bucket.datastore)
    WHERE (
      checked_partition.datastore=bucket.datastore
      AND checked_partition.object_type=bucket.object_type
      AND checked_partition.nb_partitions=bucket.nb_partitions
    )
    GROUP BY bucket.datastore, bucket.object_type, bucket.nb_partitions
  labels:
    - datastore_package
    - datastore_class
    - datastore_instance
    - object_type
    - nb_partitions
  values:
    - checked_partitions_total
    - oldest_check_date
    - oldest_check_age_seconds
I'll try it when the new schema is applied
Updated (tested) query:
SELECT
  datastore.package AS datastore_package,
  datastore.class AS datastore_class,
  datastore.instance AS datastore_instance,
  checked_partition.object_type AS object_type,
  checked_partition.nb_partitions AS nb_partitions,
  COUNT(*) AS checked_partitions_total,
  EXTRACT (EPOCH FROM MIN(last_date)) AS oldest_check_epoch,
  EXTRACT (EPOCH FROM MAX(last_date)) AS newest_check_epoch
FROM checked_partition
INNER JOIN datastore ON (datastore.id=checked_partition.datastore)
GROUP BY 1,2,3,4,5;
Diff: drop the CTE, group by the proper fields, and extract epochs instead of returning timestamptz values.
The oldest_check_age_seconds computation can be done by Prometheus.
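As a rough illustration of what can be derived downstream from the exported values, here is a minimal Python sketch. It assumes a row shaped like the output of the tested query; the function name `scrubber_derived_metrics` is hypothetical, and treating `checked_partitions_total / nb_partitions` as the "ratio that was checked" is an assumption, not something confirmed in this thread.

```python
import time


def scrubber_derived_metrics(row, now=None):
    """Derive ages and a coverage ratio from one row of the tested query.

    `row` is assumed to hold the columns returned by the query:
    nb_partitions, checked_partitions_total, oldest_check_epoch and
    newest_check_epoch (epochs in seconds).
    """
    # Allow the caller to inject "now" so the result is reproducible.
    now = time.time() if now is None else now
    return {
        # Age of the least recently checked partition, in seconds.
        "oldest_check_age_seconds": now - row["oldest_check_epoch"],
        # Age of the most recently checked partition, in seconds.
        "newest_check_age_seconds": now - row["newest_check_epoch"],
        # Assumed definition of coverage: checked partitions over all partitions.
        "checked_ratio": row["checked_partitions_total"] / row["nb_partitions"],
    }
```

In a dashboard the same subtraction and division would presumably be written as expressions over the exported metrics rather than in Python; the sketch only shows the arithmetic.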
Edited by Nicolas Dandrimont
- Nicolas Dandrimont mentioned in commit swh/infra/puppet/puppet-swh-site@ded2e4e9
- Nicolas Dandrimont mentioned in merge request swh/infra/puppet/puppet-swh-site!614 (merged)
added to a nice dashboard: https://grafana.softwareheritage.org/goto/hTa7hwBVk?orgId=1
- vlorentz closed