Add patching proxy
Looks like the proxy's `__getattr__` is completely unused.

I'm not fully convinced by the tradeoffs for introducing a whole new proxy; I see a few pros and cons:
pros:
- (I think it's a big one) cleaner separation of concerns
cons:
- if we fully separate the two result-altering proxies, multiplication of the result-altering databases
- stacked overheads for method accesses when chaining the proxies (probably mitigated by the setattr workaround; see the sketch below)
- Double the number of tests (at least for the "no altering" codepaths)
- I don't see places where we would want to use one without the other
With !1113 (merged) maybe this should all end up living in or alongside swh.alter as "the swh.alter storage proxy"?
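For illustration, a minimal sketch of the delegation pattern under discussion, including the setattr caching workaround mentioned in the cons; class and method names here are made up for the example, not the actual swh.storage code:

```python
class PatchingProxySketch:
    """Wraps a storage backend; methods it does not override pass through."""

    def __init__(self, storage):
        self.storage = storage

    def revision_get(self, revision_ids):
        # Overridden method: fetch from the backend, then alter the result.
        revisions = self.storage.revision_get(revision_ids)
        return [self._patch(rev) for rev in revisions]

    def _patch(self, revision):
        return revision  # stand-in for the actual result-altering logic

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, i.e. for every
        # method the proxy does not define itself.
        method = getattr(self.storage, name)
        # The "setattr workaround": cache the bound method on the instance
        # so later lookups skip __getattr__ (and its overhead) entirely.
        setattr(self, name, method)
        return method
```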
Edited by Nicolas Dandrimont

> Looks like the proxy's `__getattr__` is completely unused.

What do you mean? It is used for all methods not acting on revisions or releases.
> multiplication of the result-altering databases

I assumed we would end up with both proxies in the same package, so they can safely use the same database.
> With !1113 (merged) maybe this should all end up living in or alongside swh.alter as "the swh.alter storage proxy"?

👍 that would solve the DB question above.

> stacked overheads for method accesses when chaining the proxies (probably mitigated by the setattr workaround)
Even assuming the backend requests take 1ms (I can't find the data on Grafana, but IIRC most requests are in the 100ms range), the overhead is less than 1% once `setattr` is done:
```python
In [1]: import time

In [2]: class Backend:
   ...:     def f(self, arg1, arg2):
   ...:         time.sleep(0.001)
   ...:

In [3]: class Proxy:
   ...:     def f(self, arg1, arg2):
   ...:         self.storage.f(arg1, arg2)
   ...:

In [4]: backend = Backend()

In [5]: proxy = Proxy()

In [6]: proxy.storage = backend

In [7]: %timeit backend.f(123, 456)
1.16 ms ± 2.07 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [8]: %timeit proxy.f(123, 456)
1.16 ms ± 4.85 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [9]: %timeit backend.f(123, 456)
1.16 ms ± 4.74 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [10]: %timeit proxy.f(123, 456)
1.16 ms ± 3.58 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
```
> Double the number of tests (at least for the "no altering" codepaths)

Yeah, that's a bit of an issue because it adds 10s; but I think it will be acceptable once outside swh-storage and its super-long test suite.
> I don't see places where we would want to use one without the other

I don't think that's a big deal compared to the separation of concerns. We could make the same argument about the buffering and filtering proxies.
Edited by vlorentz

> > Looks like the proxy's `__getattr__` is completely unused.
>
> What do you mean? It is used for all methods not acting on revisions or releases.

Sorry, I meant that `_get_method` always returns `getattr(self.storage, method_name)` or raises `NotImplementedError`. It's meant as an allow-list so we don't accidentally forget to implement new methods on revisions and releases.
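For illustration, a hedged sketch of that allow-list shape; the method names in the set are placeholders, and the real `_get_method` in swh-storage may differ:

```python
class PatchingProxySketch:
    # Methods allowed to pass through to the backend untouched.
    # (Placeholder names; the real allow-list lives in the actual proxy.)
    PASSTHROUGH = frozenset({"content_get", "origin_get", "snapshot_get"})

    def __init__(self, storage):
        self.storage = storage

    def _get_method(self, method_name):
        if method_name in self.PASSTHROUGH:
            return getattr(self.storage, method_name)
        # Failing loudly makes the allow-list a safety net: a new backend
        # method touching revisions or releases cannot be forwarded
        # unpatched by accident.
        raise NotImplementedError(method_name)
```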
Edited by Nicolas Dandrimont

> > multiplication of the result-altering databases
>
> I assumed we would end up with both proxies in the same package, so they can safely use the same database.

> > With !1113 (merged) maybe this should all end up living in or alongside swh.alter as "the swh.alter storage proxy"?
>
> 👍 that would solve the DB question above.

Meh; that's still two connection pools to manage, double the number of connections to establish to the same database, duplicate configs, etc. As I said, I'm not convinced the separation of concerns trumps all of this overhead.
Hmm... I don't know
Done in efb60ed8.
Jenkins job DSTO/gitlab-builds #532 failed.

mentioned in issue #4707 (closed)
mentioned in merge request !1118 (closed)
Jenkins job DSTO/gitlab-builds #543 succeeded.

Jenkins job DSTO/gitlab-builds #544 succeeded.

added 27 commits
- efb60ed8...3448fc14 - 24 commits from branch master
- 8f535d84 - Introduce object patching proxy
- 952421c7 - Fix docstring
- a757063c - Make allowlist in get_method more compact
Jenkins job DSTO/gitlab-builds #550 succeeded.

As requested by @olasd, I merged the patching proxy this MR introduced into the existing masking proxy.
I kept the `MaskingProxy`/`MaskingAdmin`/`MaskingQuery` terminology, but we may want to bikeshed these.
Thanks.
As mentioned on IRC (and as expected, I guess ;-)) the separated code was cleaner, so it probably makes sense to retain it.
A couple of alternatives were mentioned:
- making one proxy inherit the other and have the shorter one use `super()`
- sharing the underlying masking connection pool by adding a bool argument to `__init__`, or special-casing `masking_db="shared"` (and using `self._masking_pool = self.storage._masking_pool`); see the sketch below
Either option would keep the separation of concerns and allow DRYing the database/connection management, which makes me happier :-)
(the pool sharing is a bit of a hack, but it is probably the option that better keeps the concerns separate)
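For illustration, a rough sketch of the `masking_db="shared"` special case, under the assumption that the wrapped proxy exposes a `_masking_pool` attribute; psycopg2 is used here only as a stand-in for whatever pool the real proxy manages:

```python
from psycopg2.pool import ThreadedConnectionPool


class MaskingProxySketch:
    def __init__(self, storage, masking_db):
        self.storage = storage
        if masking_db == "shared":
            # Reuse the wrapped proxy's pool instead of opening a second
            # set of connections to the same database.
            self._masking_pool = storage._masking_pool
        else:
            # Otherwise manage our own pool, as each proxy does today.
            self._masking_pool = ThreadedConnectionPool(0, 4, dsn=masking_db)
```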
Edited by Nicolas Dandrimont
Jenkins job DSTO/gitlab-builds #556 succeeded.

- Resolved by vlorentz
- Resolved by vlorentz