Add support for anonymized journal topics

This is another approach to implementing anonymized topics. Together with its counterpart in swh-journal, swh-journal!172 (closed), it replaces swh-journal!171 (closed) and !398 (closed).

This uses the new privileged argument of the KafkaJournalWriter.write_addition(s) methods.

Namely, for anonymizable objects (Revision and Release), this fills the following topics with unmodified objects:

  • {kafka_prefix}_privileged.release and
  • {kafka_prefix}_privileged.revision

whereas the regular topics will be filled with anonymized versions of these objects.
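
The topic routing described above can be sketched as follows. This is only an illustration, not the actual swh-journal code: the helper name plan_writes, the ANONYMIZABLE set, and the regular topic naming scheme ({kafka_prefix}.{object_type}) are assumptions made for the example; only the _privileged topic names come from this description.

```python
from typing import List, Tuple

# Object types that carry Person data and therefore get two variants
# (assumed set, based on the description: Revision and Release).
ANONYMIZABLE = {"release", "revision"}


def plan_writes(kafka_prefix: str, object_type: str) -> List[Tuple[str, str]]:
    """Return (topic, variant) pairs to write for one object type.

    Hypothetical helper: the regular topic name "{prefix}.{type}" is an
    assumption of this sketch.
    """
    regular_topic = f"{kafka_prefix}.{object_type}"
    if object_type in ANONYMIZABLE:
        return [
            # Unmodified objects go to the privileged topic...
            (f"{kafka_prefix}_privileged.{object_type}", "unmodified"),
            # ...and anonymized copies go to the regular topic.
            (regular_topic, "anonymized"),
        ]
    # Other object types have nothing to anonymize.
    return [(regular_topic, "unmodified")]
```

For example, plan_writes("swh.journal.objects", "revision") yields one privileged topic for the unmodified object and one regular topic for its anonymized copy, while a non-anonymizable type yields a single regular topic.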

The anonymization process simply consists of forging a Person whose fullname is a hash of the triplet (fullname, name, email) of the original Person in Release and Revision entities.
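
A minimal sketch of this anonymization step, under stated assumptions: the Person class below is a stand-in dataclass rather than the real model class, and the choice of SHA-256, the length-prefixed encoding of the triplet, and clearing name and email in the forged Person are all assumptions of the example, since the description does not specify them.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Person:
    """Stand-in for the model's Person (assumed fields)."""

    fullname: bytes
    name: Optional[bytes]
    email: Optional[bytes]


def anonymize(person: Person) -> Person:
    """Forge a Person whose fullname is a hash of (fullname, name, email)."""
    digest = hashlib.sha256()  # hash function is an assumption
    for field in (person.fullname, person.name, person.email):
        value = field if field is not None else b""
        # Length-prefix each field so that field boundaries stay unambiguous.
        digest.update(len(value).to_bytes(8, "big"))
        digest.update(value)
    # Assumption: the forged Person carries no name or email at all.
    return Person(fullname=digest.digest(), name=None, email=None)
```

Because the hash is deterministic, the same original Person always maps to the same anonymized Person, so authorship can still be grouped in the regular topics without exposing names or emails.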

The replayer process can therefore be used as is, provided it does not replay both the unmodified (privileged) and the anonymized topics at once.


Migrated from D3161 (view on Phabricator)
