model: optimize dictify()
- reorder conditionals to minimize the number of tests needed
- use hasattr() instead of the expensive isinstance()
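
For illustration, the two changes above might look roughly like the sketch below. This is not the actual swh.model code: the `Person` and `Color` classes, the set of "leaf" types, and the exact order of the branches are assumptions made for the example.

```python
from enum import Enum


class Color(Enum):  # hypothetical enum, for illustration only
    RED = "red"


class Person:  # hypothetical stand-in for a model object exposing to_dict()
    def __init__(self, name, tags):
        self.name = name
        self.tags = tags

    def to_dict(self):
        return {"name": dictify(self.name), "tags": dictify(self.tags)}


def dictify(value):
    # Assumed most frequent case first: plain leaf values need a single test.
    if value is None or type(value) in (str, bytes, int, bool):
        return value
    # Duck-typed check in place of an isinstance() test against a base class.
    if hasattr(value, "to_dict"):
        return value.to_dict()
    if isinstance(value, Enum):
        return value.value
    if isinstance(value, dict):
        return {k: dictify(v) for k, v in value.items()}
    if isinstance(value, tuple):
        return tuple(dictify(v) for v in value)
    return value


result = dictify(Person("alice", (Color.RED, {"k": None})))
print(result)
```

Which branch should come first depends on the data being serialized; the ordering here only assumes that leaf values dominate.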
Motivation: Ironically, this is the bottleneck of my checksumming script.
Benchmark:
In [1]: from swh.storage import get_storage
In [2]: from swh.model.identifiers import CoreSWHID, _BaseSWHID
In [3]: s = get_storage('remote', url='http://moma.internal.softwareheritage.org:5002/')
In [4]: rev = s.revision_get([bytes.fromhex("747675816d815e86b7482b5a0acb9110eeeec590")])[0]
Before this commit:
In [18]: %timeit rev.to_dict()
10000 loops, best of 5: 70.4 µs per loop
In [19]: %timeit rev.to_dict()
10000 loops, best of 5: 69.3 µs per loop
After this commit:
In [5]: %timeit rev.to_dict()
10000 loops, best of 5: 48.4 µs per loop
In [6]: %timeit rev.to_dict()
10000 loops, best of 5: 45.7 µs per loop
In [7]: %timeit rev.to_dict()
10000 loops, best of 5: 47.5 µs per loop
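
As a rough summary of the timings above (averaging the runs is my own choice here, not something %timeit reports), the improvement can be computed as:

```python
from statistics import mean

# Timings in microseconds, copied from the %timeit runs above.
before_us = [70.4, 69.3]
after_us = [48.4, 45.7, 47.5]

mean_before = mean(before_us)
mean_after = mean(after_us)

# Relative reduction in time per call, and the equivalent speedup factor.
reduction_pct = (mean_before - mean_after) / mean_before * 100
speedup = mean_before / mean_after

print(f"mean before: {mean_before:.2f} µs, mean after: {mean_after:.2f} µs")
print(f"reduction: {reduction_pct:.1f}%, speedup: {speedup:.2f}x")
```

That is roughly a third less time per to_dict() call.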
Unfortunately there isn't much more we can do: 90% of the time is spent
constructing a dict, even when replacing the dictcomp
    {k: dictify(v) for k, v in value.items()}
with map() + dict().
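
For reference, the two dict constructions being compared could be written as in the sketch below. The exact map() + dict() form that was tried is not recorded here, so `with_map` is only one plausible spelling, and `convert` is a hypothetical stand-in for dictify().

```python
import timeit


def convert(value):
    # Hypothetical stand-in for dictify(): leaves the value unchanged.
    return value


def with_dictcomp(value):
    return {k: convert(v) for k, v in value.items()}


def with_map(value):
    # One possible map() + dict() variant: pair each key with its converted value.
    return dict(zip(value.keys(), map(convert, value.values())))


sample = {f"key{i}": i for i in range(10)}

if __name__ == "__main__":
    for fn in (with_dictcomp, with_map):
        elapsed = timeit.timeit(lambda: fn(sample), number=1000)
        print(f"{fn.__name__}: {elapsed * 1000:.3f} ms for 1000 calls")
```

Both variants build the same dict, so any difference between them comes from per-item overhead rather than from the dict construction itself.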
Migrated from D6319 (view on Phabricator)