Implement Shard.delete()
To make it possible to remove objects from objstorage (in the case of takedown notices), we add a new Shard.delete() method.
As Shard files use a perfect hash function computed on creation, together with fixed offsets, completely removing an object would amount to recreating the Shard from scratch. As these files are meant to be quite large and removals should be rare, we just overwrite the object size and data with zeros.
The object position in the hash table is also replaced with UINT64_MAX to signal that the object has been removed. Shard.lookup() has been updated accordingly and will raise a KeyError if the object matching a key has been deleted.
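The tombstone scheme described above can be sketched on a toy layout. This is only an illustration, not the actual Shard file format: the little-endian encoding, the `build`/`lookup`/`delete` helpers and the use of a plain slot number in place of the perfect hash of a key are all assumptions made for the example.

```python
import struct

UINT64_MAX = 2**64 - 1
ENTRY = struct.Struct("<Q")  # one little-endian uint64 (assumed encoding)


def build(objects):
    """Pack objects into a toy buffer: an index of offsets, then size-prefixed payloads."""
    buf = bytearray(ENTRY.size * len(objects))  # index area, filled in below
    for slot, data in enumerate(objects):
        # The record for this slot starts at the current end of the buffer.
        ENTRY.pack_into(buf, slot * ENTRY.size, len(buf))
        buf += ENTRY.pack(len(data)) + data
    return buf


def lookup(buf, slot):
    """Return the payload stored for a slot, or raise KeyError if it was deleted."""
    (offset,) = ENTRY.unpack_from(buf, slot * ENTRY.size)
    if offset == UINT64_MAX:
        raise KeyError(slot)
    (size,) = ENTRY.unpack_from(buf, offset)
    start = offset + ENTRY.size
    return bytes(buf[start:start + size])


def delete(buf, slot):
    """Zero the size and payload in place, then mark the index entry as removed."""
    (offset,) = ENTRY.unpack_from(buf, slot * ENTRY.size)
    if offset == UINT64_MAX:
        raise KeyError(slot)
    (size,) = ENTRY.unpack_from(buf, offset)
    # Same-length slice assignment: the buffer size and all other offsets are unchanged.
    buf[offset:offset + ENTRY.size + size] = bytes(ENTRY.size + size)
    ENTRY.pack_into(buf, slot * ENTRY.size, UINT64_MAX)
```

Because nothing moves, the other objects keep their offsets and the index stays valid, which is why no rebuild of the file is needed.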
The interface is not ideal, but this is due to a more general problem with the design of the API. The caller must be careful not to run delete() on a “created” or “loaded” Shard, as the method takes care of opening the Shard file in read/write mode, overwriting the right bytes and closing the file again.
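As a rough illustration of the open, overwrite, close sequence, here is a minimal sketch of a static method that patches a file in place. `ToyShard`, its one-object layout and its `create` helper are hypothetical and do not reflect the real Shard class or its signature; the only point is the use of `r+b` mode, which opens an existing file for reading and writing without truncating it.

```python
import os
import struct
import tempfile

UINT64_MAX = 2**64 - 1


class ToyShard:
    """Hypothetical one-object file: a uint64 index entry, then size + payload."""

    FIELD = struct.Struct("<Q")  # assumed encoding, for illustration only

    @staticmethod
    def create(path, data):
        with open(path, "wb") as f:
            # The single record starts right after the index entry.
            f.write(ToyShard.FIELD.pack(ToyShard.FIELD.size))
            f.write(ToyShard.FIELD.pack(len(data)) + data)

    @staticmethod
    def delete(path):
        # The method owns the file handle for the whole operation.
        with open(path, "r+b") as f:
            (offset,) = ToyShard.FIELD.unpack(f.read(ToyShard.FIELD.size))
            if offset == UINT64_MAX:
                raise KeyError(path)
            f.seek(offset)
            (size,) = ToyShard.FIELD.unpack(f.read(ToyShard.FIELD.size))
            f.seek(offset)
            f.write(bytes(ToyShard.FIELD.size + size))  # zero size and payload
            f.seek(0)
            f.write(ToyShard.FIELD.pack(UINT64_MAX))  # tombstone the index entry


with tempfile.TemporaryDirectory() as tmp:
    shard_path = os.path.join(tmp, "toy.shard")
    ToyShard.create(shard_path, b"secret")
    ToyShard.delete(shard_path)
    with open(shard_path, "rb") as f:
        remaining = f.read()
```

Since the method opens and closes the file itself, calling it while another handle on the same file is still open is what the caller has to avoid.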
Related to swh-alter#4 (closed) (for Winery)
Activity
changed milestone to %Tooling for takedown notices [Roadmap - Collect]
assigned to @lunar
Jenkins job DOPH/gitlab-builds #16 succeeded.
See Console Output and Coverage Report for more details.
@jenkins retry build
Jenkins job DOPH/gitlab-builds #26 succeeded.
See Console Output and Coverage Report for more details.
added 23 commits
- 11e3872e...fb2d4bb8 - 20 commits from branch swh/devel:master
- 2e63afd2 - Prevent double-free in shard_close()
- 1661c9dc - Refactor tests so Shard creation is in a dedicated fixture
- 22ec47de - Implement Shard.delete()
Jenkins job DOPH/gitlab-builds #47 succeeded.
See Console Output and Coverage Report for more details.
Jenkins job DOPH/gitlab-builds #48 succeeded.
See Console Output and Coverage Report for more details.
With the new API neatly separating shard creation from usage, it feels a bit inconsistent to have delete as a staticmethod of the (used-to-be readonly) Shard instead of having a separate helper class with access to the delete method.
We'll probably also want to be able to delete multiple objects from the shard, opening and closing it only once. What do you think?
Yeah, I am a bit lost on the right way to do this. Using a static method felt the easiest semantically because deleting is about taking an existing shard (so using the Shard class instead of ShardCreator felt better) but changing it. Having delete as an instance method of Shard felt wrong, as to do the change we need to open the file in r+ mode. My impression was that deletion was infrequent enough that we could just take the extra cost of opening and closing the file once for each object… I prioritized code simplicity and interface safety, but if you have another idea I'll gladly give it a shot.
Oh, also ObjStorageInterface.delete takes a single obj_id (source)
mentioned in merge request swh-objstorage!166 (merged)