multiplexer: Check backends with respect to their read-only nature
Prior to this commit, a multiplexer instance would start during its initialization to check the configuration of all its backends but also check whether they can write or not even when not needed. As it exists read-only backends for a multiplexer, that would just fail to start the service.
This evolves the objstorage to grow a check_write attribute (hence parametric through configuration file). And then adapts accordingly the multiplexer constructor so it can check the configuration of its backends according to their respective configuration.
Refs. #4746 (closed) Refs. swh/infra/sysadm-environment#5334 (closed)
Merge request reports
Activity
Jenkins job DOBJS/gitlab-builds #252 failed .
See Console Output and Coverage Report for more details.question: shouldn't the flag actually be
read_only
rather thancheck_write
?I don't really understand the business logic behind storing a
check_write
flag. What really is its semantic?I get it solves the issue we are facing, but I'm not convinced it's the right answer (albeit both cases look very close, they have different meaning). In this regard,
read_only
is a simple and easy to understand config entry.(But then you have(?) to implement it for all backends, or at least raise a configuration error if need be or someting)
question: shouldn't the flag actually be read_only rather than check_write?
Yes, I asked myself that very question but kept the simple implem. It'd be more readable.
I also asked myself if then the check_write parameter of the method check_config is still relevant... (but that impacts the interface so i kept it).
I get it solves the issue we are facing, but I'm not convinced it's the right answer (albeit both cases look very close, they have different meaning).
I believe the 2 cases you mention are:
- is the objstorage read-only?
- Do the check_config method check whether the objstorage is writable?
Afaict, we did not differentiate so far.
In this regard, read_only is a simple and easy to understand config entry.
yes, i agree.
(But then you have(?) to implement it for all backends, or at least raise a configuration error if need be or someting)
I don't see what I would need to adapt in the other backends, care to explicit please?
In the end, it's not needed at all. It's only a log that is (erroneously?) reported to sentry [1]
[1]
11:08:41 <ardumont> i've noticed another error too anterior to the last deployment in the objstorage too 11:09:11 <ardumont> i've opened https://gitlab.softwareheritage.org/swh/devel/swh-objstorage/-/issues/4746 and fixed it in https://gitlab.softwareheritage.org/swh/devel/swh-objstorage/-/merge_requests/195 11:09:26 <ardumont> (still fighting the unclear test_types test about it but the gist of it should fix it) 11:10:04 <ardumont> it's less impactful than the other azure objstorage issue as it only fails the read-only multiplexer but still something 11:21:45 <olasd> it doesn't fail. the exception gets reported to sentry but it's handled 11:24:16 <ardumont> ok 11:24:20 <ardumont> it's confusing though
mentioned in issue swh/infra/sysadm-environment#5337 (closed)
Exceptions that are expected in
check_config(check_write=True)
for read-only objstorages should already be:- supported for serialization and deserialization by the RPC
- allowed explicitly in the multiplexer
__init__
code.
For the specific case of the
PermissionError
, I believe that the sentry reports are due to it being raised before the new version of the swh.objstorage RPC server was deployed. It's handled by the RPC with a 403 error, which quiesces sentry reports.Overall, I think we should move towards changing check_config to return a dict with a handful of boolean keys (I can think of
reads_allowed
,writes_allowed
,deletes_allowed
, and maybelisting_allowed
for a start)Edited by Nicolas DandrimontThe multiplexer could then be configured to make sure that the config of the successive backends matches what we expect. In the current situation, if the multiplexer azure backend were to be initialized read-only due to a configuration error, it would be silently ignored for writes (I believe the previous implementation would have failed to commit the first object written)