Store original git manifests
Store them for "weird" objects (those whose original manifest differs from the one we would compute), alongside the rest of the row.
To do:
- Add a new `raw_manifest` field to Directory and Revision objects (and Release, for future-proofing), with type `Optional[bytes]`. It should include the type and size (i.e. the complete git header).
- Add a check method to model objects. It should check that the id matches (`self.compute_hash() == self.id`), but also that if `raw_manifest is not None`, then it differs from the manifest we would compute (i.e. there shouldn't be a useless value in `raw_manifest`).
- Add a column in postgres, defaulting to NULL, and write it.
- Monitor the number of objects with a non-NULL `raw_manifest`, and warn if it rises too fast (it probably means there is a bug in a loader) -> swh-counters
- Figure out a way to report issues directly from the git loader? (e.g. make the git loader raise an issue in Sentry if too many objects in the same repo have a `raw_manifest`)
- Make the vault use it when available, sometime after step 3.
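The field and the check method described in the first two items could look roughly like the sketch below. It is only an illustration, not the actual model code: the `body` attribute is a hypothetical stand-in for the real parsed fields, `git_manifest` and `computed_manifest` are made-up helper names, and it assumes that `compute_hash()` hashes `raw_manifest` when it is set.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


def git_manifest(git_type: str, body: bytes) -> bytes:
    # A git object is hashed as: "<type> <size>" + NUL byte + body.
    header = f"{git_type} {len(body)}".encode("ascii") + b"\x00"
    return header + body


@dataclass(frozen=True)
class Revision:
    id: bytes
    body: bytes  # hypothetical stand-in for the parsed revision fields
    raw_manifest: Optional[bytes] = None  # includes the git header

    def computed_manifest(self) -> bytes:
        # The manifest we would rebuild from the parsed fields alone.
        return git_manifest("commit", self.body)

    def compute_hash(self) -> bytes:
        # Assumption: the original manifest takes precedence when present.
        if self.raw_manifest is not None:
            manifest = self.raw_manifest
        else:
            manifest = self.computed_manifest()
        return hashlib.sha1(manifest).digest()

    def check(self) -> None:
        if self.compute_hash() != self.id:
            raise ValueError("id does not match the hash of the manifest")
        if (
            self.raw_manifest is not None
            and self.raw_manifest == self.computed_manifest()
        ):
            raise ValueError("raw_manifest is set but redundant")


body = b"message\n"

# Regular object: the id can be recomputed, so raw_manifest stays None.
normal = Revision(id=hashlib.sha1(git_manifest("commit", body)).digest(), body=body)
normal.check()

# "Weird" object: the original bytes differ from what we would recompute
# (here, an extra trailing newline that the stand-in parsed body lacks).
original = git_manifest("commit", body + b"\n")
weird = Revision(id=hashlib.sha1(original).digest(), body=body, raw_manifest=original)
weird.check()
```

With this shape, a regular object keeps `raw_manifest` as `None`, which matches the idea of a postgres column that defaults to NULL.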
Migrated from T3753 (view on Phabricator)