multiplexer: Check backends with respect to their read-only nature

Jenkins job DOBJS/gitlab-builds #252 failed .
See Console Output and Coverage Report for more details.

question: shouldn't the flag actually be read_only rather than check_write?

I don't really understand the business logic behind storing a check_write flag. What really is its semantic?

I get it solves the issue we are facing, but I'm not convinced it's the right answer (albeit both cases look very close, they have different meaning). In this regard, read_only is a simple and easy to understand config entry.

(But then you have(?) to implement it for all backends, or at least raise a configuration error if need be or someting)

question: shouldn't the flag actually be read_only rather than check_write?

Yes, I asked myself that very question but kept the simple implem. It'd be more readable.

I also asked myself if then the check_write parameter of the method check_config is still relevant... (but that impacts the interface so i kept it).

I get it solves the issue we are facing, but I'm not convinced it's the right answer (albeit both cases look very close, they have different meaning).

I believe the 2 cases you mention are:

is the objstorage read-only?
Do the check_config method check whether the objstorage is writable?

Afaict, we did not differentiate so far.

In this regard, read_only is a simple and easy to understand config entry.

yes, i agree.

(But then you have(?) to implement it for all backends, or at least raise a configuration error if need be or someting)

I don't see what I would need to adapt in the other backends, care to explicit please?

In the end, it's not needed at all. It's only a log that is (erroneously?) reported to sentry [1]

[1]

11:08:41 <ardumont> i've noticed another error too anterior to the last deployment in the objstorage too
11:09:11 <ardumont> i've opened https://gitlab.softwareheritage.org/swh/devel/swh-objstorage/-/issues/4746 and fixed it in https://gitlab.softwareheritage.org/swh/devel/swh-objstorage/-/merge_requests/195
11:09:26 <ardumont> (still fighting the unclear test_types test about it but the gist of it should fix it)
11:10:04 <ardumont> it's less impactful than the other azure objstorage issue as it only fails the read-only multiplexer but still something
11:21:45 <olasd> it doesn't fail. the exception gets reported to sentry but it's handled
11:24:16 <ardumont> ok
11:24:20 <ardumont> it's confusing though

closed

mentioned in issue swh/infra/sysadm-environment#5337 (closed)

Exceptions that are expected in check_config(check_write=True) for read-only objstorages should already be:

supported for serialization and deserialization by the RPC
allowed explicitly in the multiplexer __init__ code.

For the specific case of the PermissionError, I believe that the sentry reports are due to it being raised before the new version of the swh.objstorage RPC server was deployed. It's handled by the RPC with a 403 error, which quiesces sentry reports.

Overall, I think we should move towards changing check_config to return a dict with a handful of boolean keys (I can think of reads_allowed, writes_allowed, deletes_allowed, and maybe listing_allowed for a start)

The multiplexer could then be configured to make sure that the config of the successive backends matches what we expect. In the current situation, if the multiplexer azure backend were to be initialized read-only due to a configuration error, it would be silently ignored for writes (I believe the previous implementation would have failed to commit the first object written)

multiplexer: Check backends with respect to their read-only nature

Merge request reports

Activity