Add support for model object anonymization
Simply add a BaseModel.anonymize() method. Default implementation returns None, meaning the object is not anonymizable.
For Revision, Release and Person, the method returns an anonymized version of the object.
Partially replaces the pair swh-journal!172 (closed) / swh-storage!400 (closed).
See swh-journal!173 (closed) for the part in swh.journal.
Migrated from D3171 (view on Phabricator)
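The behaviour described above can be sketched as follows. This is a minimal, self-contained illustration, not the swh.model code: it uses plain dataclasses, a reduced set of fields, a hypothetical `Content` class standing in for any non-anonymizable object, and it assumes SHA-256 as the hash applied to `fullname` (the description only says "hashed").

```python
import hashlib
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class BaseModel:
    def anonymize(self) -> Optional["BaseModel"]:
        # Default implementation: None means "this object is not anonymizable".
        return None


@dataclass(frozen=True)
class Person(BaseModel):
    fullname: bytes
    name: Optional[bytes] = None
    email: Optional[bytes] = None

    def anonymize(self) -> "Person":
        # Assumption: SHA-256 is the hash; name and email are left unset.
        return Person(fullname=hashlib.sha256(self.fullname).digest())


@dataclass(frozen=True)
class Revision(BaseModel):
    # Reduced field set, for illustration only.
    message: bytes
    author: Person
    committer: Person

    def anonymize(self) -> "Revision":
        # Same revision, with every Person replaced by its anonymized version.
        return replace(
            self,
            author=self.author.anonymize(),
            committer=self.committer.anonymize(),
        )


@dataclass(frozen=True)
class Content(BaseModel):
    # Hypothetical non-anonymizable object: inherits the default anonymize().
    data: bytes
```

With these definitions, `Content(data=b"x").anonymize()` is `None`, while `Revision.anonymize()` returns a copy whose `author` and `committer` carry only a hashed `fullname`.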
Build is green
Patch application report for D3171 (id=11258)
Rebasing onto cce30366...
Current branch diff-target is up to date.
Changes applied before test
commit 0292f52f53294f2a5e809218d67557940abeb34e
Author: David Douard <david.douard@sdfa3.org>
Date: Tue May 19 16:04:30 2020 +0200

    Add support for model object anonymization

    Simply add a BaseModel.anonymize() method. Default implementation returns None, meaning the object is not anonymizable.

    For Revision, Release and Person, the method returns an anonymized version of the object.
See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/60/ for more details.
mentioned in merge request swh-journal!172 (closed)
mentioned in merge request swh-storage!400 (closed)
         return Person(name=name or None, email=email or None, fullname=fullname,)

+    def anonymize(self) -> "Person":
+        """Returns an anonymized version of the Person object.
+
+        Anonymization is simply a Person which fullname is the hashed, with unset name
+        or email.

! In !251 (closed), @vlorentz wrote: Shouldn't we make anonymized objects error when their compute_hash() method is called?

Maybe, but that would require we keep the info "this is an anonymized object" somewhere, which is not the case for now. This idea can be dealt with later, maybe?
! In !251 (closed), @douardda wrote: Maybe, but that would require we keep the info "this is an anonymized object" somewhere, which is not the case for now. This idea can be dealt with later, maybe?
That's why I'm asking now, so you/we don't have to do some code changes later. But if you're comfortable with it, then fine.
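For reference, the idea raised in this thread could look roughly like the sketch below. Nothing here exists in the merge request: the `anonymized` flag, the flattened `author_fullname` field and the SHA-1 stand-in for `compute_hash()` are all hypothetical, and SHA-256 is assumed for the anonymization hash. It only illustrates the point made above, namely that raising an error requires the object to remember that it was anonymized.

```python
import hashlib
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Revision:
    message: bytes
    author_fullname: bytes
    # Hypothetical marker: the extra piece of state the discussion refers to.
    anonymized: bool = False

    def anonymize(self) -> "Revision":
        # Assumption: SHA-256 of the fullname, as a stand-in for Person.anonymize().
        hashed = hashlib.sha256(self.author_fullname).digest()
        return replace(self, author_fullname=hashed, anonymized=True)

    def compute_hash(self) -> bytes:
        if self.anonymized:
            # The hash would no longer match the identifier of the original object.
            raise ValueError("cannot compute the hash of an anonymized object")
        # Stand-in only; this is not the real identifier computation.
        return hashlib.sha1(self.author_fullname + b"\n" + self.message).digest()
```

Without such a flag, an anonymized object is just another instance of the same class, and `compute_hash()` has no way to tell the difference.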
Build is green
Patch application report for D3171 (id=11267)
Rebasing onto cce30366...
Current branch diff-target is up to date.
Changes applied before test
commit e40fe471031bc85f9d40be163cba9d7351a02888
Author: David Douard <david.douard@sdfa3.org>
Date: Tue May 19 16:04:30 2020 +0200

    Add support for model object anonymization

    Simply add a BaseModel.anonymize() method. Default implementation returns None, meaning the object is not anonymizable.

    For Person, the method returns a Person with hashed fullname (and unset name and email).

    For Revision and Release, the method returns an anonymized version of the object, i.e. with instances of Person replaced by anonymized ones.
See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/61/ for more details.
     return "%s://%s" % (protocol, domain)


-def persons_d():
-    return builds(
-        dict, fullname=binary(), email=optional(binary()), name=optional(binary()),
-    )
+@composite
+def persons_d(draw):
+    fullname = draw(binary())
+    email = draw(optional(binary()))
+    name = draw(optional(binary()))
+    assume(not (len(fullname) == 32 and email is None and name is None))
+    return dict(fullname=fullname, name=name, email=email)

Build is green
Patch application report for D3171 (id=11268)
Rebasing onto cce30366...
Current branch diff-target is up to date.
Changes applied before test
commit 0f3af381835fc2f1e3e420519d0bba7aef3d8ce6
Author: David Douard <david.douard@sdfa3.org>
Date: Tue May 19 16:04:30 2020 +0200

    Add support for model object anonymization

    Simply add a BaseModel.anonymize() method. Default implementation returns None, meaning the object is not anonymizable.

    For Person, the method returns a Person with hashed fullname (and unset name and email).

    For Revision and Release, the method returns an anonymized version of the object, i.e. with instances of Person replaced by anonymized ones.
See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/62/ for more details.
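The assume() call in the revised persons_d strategy above discards generated persons that have a 32-byte fullname and neither name nor email. The thread does not state the reason; a plausible reading is that such inputs have exactly the shape of an anonymized person when the hash is 32 bytes long, as with SHA-256. The sketch below illustrates that reading with two hypothetical helpers, `anonymize_person` and `looks_anonymized`, which are not part of the diff.

```python
import hashlib


def anonymize_person(person: dict) -> dict:
    # Assumption: SHA-256 of the fullname, with name and email unset.
    return dict(
        fullname=hashlib.sha256(person["fullname"]).digest(),
        name=None,
        email=None,
    )


def looks_anonymized(person: dict) -> bool:
    # Same condition as the one negated inside assume() in persons_d.
    return (
        len(person["fullname"]) == 32
        and person["email"] is None
        and person["name"] is None
    )


ordinary = dict(fullname=b"Jane Doe <jane@example.org>", name=b"Jane Doe", email=b"jane@example.org")
# An arbitrary 32-byte fullname with no name or email: not a hash of anything,
# yet it cannot be told apart from an anonymized person by shape alone.
ambiguous = dict(fullname=b"x" * 32, name=None, email=None)
```

Here `looks_anonymized(anonymize_person(ordinary))` and `looks_anonymized(ambiguous)` are both true, while `looks_anonymized(ordinary)` is false, so excluding the ambiguous shape from generated test data keeps the two populations distinct.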
Build is green
Patch application report for D3171 (id=11270)
Rebasing onto cce30366...
Current branch diff-target is up to date.
Changes applied before test
commit 29312dff6d96ac1c9bc18bf98de1d2e27a76c334
Author: David Douard <david.douard@sdfa3.org>
Date: Tue May 19 16:04:30 2020 +0200

    Add support for model object anonymization

    Simply add a BaseModel.anonymize() method. Default implementation returns None, meaning the object is not anonymizable.

    For Person, the method returns a Person with hashed fullname (and unset name and email).

    For Revision and Release, the method returns an anonymized version of the object, i.e. with instances of Person replaced by anonymized ones.
See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/63/ for more details.