Winery settings refactorings
Compare changes
Some changes are not shown
For a faster browsing experience, some files are collapsed by default.
Files
42+ 132
− 17
@@ -17,33 +17,148 @@ If the current accumulated bandwidth is above the maximum desired speed for N ac
The `sharedstorage.py` file contains the global index implementation that associates every object id to the shard it contains. A list of shard (either writable or readonly) is stored in a table, with a numeric id to save space. The name of the shard is used to create a database (for write shards) or a RBD image (for read shards).
:py:mod:`swh.objstorage.backends.winery.sharedbase` contains the global objstorage index implementation, which associates every object id (currently, the SHA256 of the content) to the shard it contains. The list of shards is stored in a table, associating them with a numeric id to save space, and their current :py:class:`swh.objstorage.backends.winery.sharedbase.ShardState`. The name of the shard is used to create a table (for write shards) or a RBD image (for read shards).
:py:mod:`swh.objstorage.backends.winery.roshard` handles read-only shard management: classes handling the lifecycle of the shards pool, the :py:class:`swh.objstorage.backends.winery.roshard.ROShardCreator`, as well as :py:class:`swh.objstorage.backends.winery.roshard.ROShard`, a thin layer on top of :py:mod:`swh.perfecthash` used to access the objects stored inside a read-only shard.
The `obstorage.py` file contains the backend implementation in the `WineryObjStorage` class. It is a thin layer that delegates writes to a `WineryWriter` instance and reads to a `WineryReader` instance. Although they are currently tightly coupled, they are likely to eventually run in different workers if performance and security requires it.
:py:class:`swh.objstorage.backends.winery.objstorage.WineryObjStorage` is the main entry point compatible with the :py:mod:`swh.objstorage` interface. It is a thin layer backed by a :py:class:`swh.objstorage.backends.winery.objstorage.WineryWriter` for writes, and a :py:class:`swh.objstorage.backends.winery.objstorage.WineryReader` for read-only accesses.
A `WineryReader` must be able to read an object from both Read Shards and Write Shards. It will first determine the kind of shard the object belongs to by looking it up in the global index. If it is a Read Shard, it will lookup the object using the `ROShard` class from `roshard.py`, ultimately using a Ceph RBD image. If it is a Write Shard, it will lookup the object using the `RWShard` class from `rwshard.py`, ultimately using a PostgreSQL database.
:py:class:`swh.objstorage.backends.winery.objstorage.WineryReader` performs read-only actions on both read-only shards and write shards. It will first determine the kind of shard the object belongs to by looking it up in the global index. If it is a read-only Shard, it will lookup the object using :py:class:`swh.objstorage.backends.winery.roshard.ROShard`, backed by the RBD or directory-based shards pool. If it is a write shard, it will lookup the object using the :py:class:`swh.objstorage.backends.winery.rwshard.RWShard`, ultimately using a PostgreSQL table.
and it will lock the write Shard and own it so no other instance tries to write to it. Locking is done transactionally by setting a locker id in the shards index, when the :py:class:`swh.objstorage.backends.winery.objstorage.WineryWriter` process dies unexpectedly, these entries need to be manually cleaned up.
When a new object is added to the Write Shard, a new row is added to the global index to record that it is owned by this Write Shard and is in flight. Such an object cannot be read because it is not yet complete. If a request to write the same object is sent to another `WineryWriter` instance, it will fail to add it to the global index because it already exists. Since the object is in flight, the `WineryWriter` will check if the shard associated to the object is:
When the cumulative size of all objects within a Write Shard exceeds a threshold, it is set to be in the `full` state. All objects it contains can be read from it by any :py:class:`swh.objstorage.backends.winery.objstorage.WineryReader` but no new object will be added to it. When `pack_immediately` is set, a process is spawned and is tasked to transform the `full` shard into a Read Shard using the :py:class:`swh.objstorage.backends.winery.objstorage.Packer` class. Should the packing process fail for any reason, a cron job will restart it when it finds Write Shards that are both in the `packing` state and not locked by any process. Packing is done by enumerating all the records from the Write Shard database and writing them into a Read Shard by the same name. Incomplete Read Shards will never be used by :py:class:`swh.objstorage.backends.winery.objstorage.WineryReader` because the global index will direct it to use the Write Shard instead. Once the packing completes, the state of the shard is modified to be `packed`, and from that point on the :py:class:`swh.objstorage.backends.winery.objstorage.WineryReader` will only use the Read Shard to find the objects it contains. If `clean_immediately` is set, the table containing the Write Shard is then destroyed because it is no longer useful and the process terminates on success.
When the size of the database associated with a Write Shard exceeds a threshold, it is set to be in the `packing` state. All objects it contains can be read from it by any `WineryReader` but no new object will be added to it. A process is spawned and is tasked to transform it into a Read Shard using the `Packer` class. Should it fail for any reason, a cron job will restart it when it finds Write Shards that are both in the `packing` state and not locked by any process. Packing is done by enumerating all the records from the Write Shard database and writing them into a Read Shard by the same name. Incomplete Read Shards will never be used by `WineryReader` because the global index will direct it to use the Write Shard instead. Once the packing completes, the state of the shard is modified to be readonly and from that point on the `WineryReader` will only use the Read Shard to find the objects it contains. The database containing the Write Shard is then destroyed because it is no longer useful and the process terminates on success.
* the RW shard cleaner `swh objstorage winery rw-shard-cleaner` performs clean up of the `packed` read-write shards, as soon as they are recorded as mapped on enough (`--min-mapped-hosts`) hosts. They get locked in the `cleaning` state, the database cleanup is performed, then the shard gets moved in the final `readonly` state.