Add support for removing RawExtrinsicMetadata and ExtID
While we were already able to back up, restore, and remove most object types, support was missing for two other object types present in our storage: RawExtrinsicMetadata and ExtID.
Both are primarily accessed not through their own identifier but through the identifier of their target, the object they are associated with. When we remove an object, we want to remove all RawExtrinsicMetadata and ExtID objects associated with it.
While they share this common property, they are handled slightly differently:
RawExtrinsicMetadata objects can themselves be addressed using Extended SWHIDs.
swh.alter.inventory.get_raw_extrinsic_metadata() takes a list of
targets and returns a list of SWHIDs for the associated
RawExtrinsicMetadata. It makes sure to recursively add any
RawExtrinsicMetadata referencing a relevant RawExtrinsicMetadata.
This list of RawExtrinsicMetadata SWHIDs can thus be added to the
list of objects to be removed in Remover.get_removable(). The
ObjectDeletionInterface.object_delete() method of swh-storage
then takes care of removing the RawExtrinsicMetadata objects.
ExtID objects are not directly addressable. We thus handle them a bit like
OriginVisit and OriginVisitStatus objects: we find them and add them to
the recovery bundle while adding their target. They are deleted using
ObjectDeletionInterface.extid_delete_for_target() after we have
deleted the targets.
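The two strategies above can be sketched as follows. This is only an illustration over plain in-memory dictionaries, not the swh-alter or swh-storage code: the function names, the dictionary layouts, and the shortened SWHIDs are assumptions made for the example.

```python
from collections import deque


def collect_metadata_swhids(targets, metadata_by_target):
    """Return SWHIDs of all metadata attached to `targets`, transitively.

    `metadata_by_target` is a hypothetical index mapping a target SWHID
    to the SWHIDs of the RawExtrinsicMetadata objects attached to it.
    """
    found = []
    seen = set()
    queue = deque(targets)
    while queue:
        target = queue.popleft()
        for emd_swhid in metadata_by_target.get(target, ()):
            if emd_swhid in seen:
                continue
            seen.add(emd_swhid)
            found.append(emd_swhid)
            # Metadata can itself be the target of further metadata,
            # so it is queued to be looked up in turn.
            queue.append(emd_swhid)
    return found


def remove_with_extids(targets, objects, extids_by_target):
    """Bundle ExtIDs alongside their targets, then delete both.

    `objects` and `extids_by_target` are hypothetical stand-ins for the
    storage. The returned list stands in for the recovery bundle.
    """
    bundle = []
    for target in targets:
        bundle.append(("object", target, objects[target]))
        # ExtIDs have no identifier of their own here: they are found
        # through their target and saved next to it.
        for extid in extids_by_target.get(target, ()):
            bundle.append(("extid", target, extid))
    # Delete the targets first, then the ExtIDs pointing at them.
    for target in targets:
        del objects[target]
    for target in targets:
        extids_by_target.pop(target, None)
    return bundle
```

In this sketch, metadata SWHIDs end up in the same list as any other object to remove, while ExtIDs only ever travel with their target.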
In both cases, we have to accept the possibility that new RawExtrinsicMetadata or ExtID objects are added between the listing and the deletion. New RawExtrinsicMetadata objects would still be present in the storage, although hard to reach. A scrubber job could look for these and remove them. New ExtID objects, however, would be entirely lost: some information would be missing from the archive in case of a recovery.
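As a rough idea of what such a scrubber check could look like, here is a minimal sketch. It assumes a hypothetical mapping from each RawExtrinsicMetadata SWHID to its target and a set of SWHIDs still present in storage. No such job exists in this change, and the names are made up for the example.

```python
def find_dangling_metadata(metadata_targets, stored_objects):
    """Return SWHIDs of metadata no longer attached to a stored object.

    `metadata_targets` maps a metadata SWHID to the SWHID of its target
    (hypothetical layout). `stored_objects` is the set of SWHIDs of the
    non-metadata objects still present in storage.
    """
    dangling = []
    for emd_swhid in sorted(metadata_targets):
        current = emd_swhid
        hops = set()
        # Follow the chain of targets for as long as it points at other
        # metadata. `hops` guards against looping forever on bad data.
        while current in metadata_targets and current not in hops:
            hops.add(current)
            current = metadata_targets[current]
        # The chain must end on an object that is still stored.
        if current not in stored_objects:
            dangling.append(emd_swhid)
    return dangling
```

Metadata that describes other dangling metadata is reported as well, since the whole chain has lost its anchor.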
Adding RawExtrinsicMetadata and ExtID objects to the recovery bundle requires bumping the format version. The tests are updated to make sure that we can still restore older bundles.
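For illustration only, backward-compatible loading could follow a pattern like the one below. The version numbers, manifest keys, and function name are all assumptions for the sketch and do not reflect the actual recovery bundle format.

```python
# Hypothetical version numbers: 1 predates the new object types,
# 2 adds them.
SUPPORTED_VERSIONS = {1, 2}


def load_manifest(manifest):
    """Normalize a bundle manifest (a plain dict in this sketch)."""
    version = manifest.get("version")
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported recovery bundle version: {version!r}")
    return {
        "version": version,
        "swhids": list(manifest.get("swhids", [])),
        # Older bundles have no such sections: treat them as empty
        # instead of failing, so that they can still be restored.
        "raw_extrinsic_metadata": list(manifest.get("raw_extrinsic_metadata", [])),
        "extids": list(manifest.get("extids", [])),
    }
```

Defaulting the new sections to empty lists keeps the restore path identical for old and new bundles.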
Removing and restoring RawExtrinsicMetadata objects raises the question of how to handle MetadataAuthority and MetadataFetcher objects. Currently, these are not removed by swh-alter. This means we can assume that the required objects will be present in the storage when adding RawExtrinsicMetadata objects from a recovery bundle. It also means we might leave dangling MetadataAuthority and MetadataFetcher objects behind. The issue is tracked as #21.
Closes #11