license dataset: Java source code missing from the replication package
Up to the 2021 version of the dataset, we shipped the Java source of the custom code (used, e.g., to find the earliest occurrence of a license blob) as part of the dataset, in a java/ subdir. It seems to be gone from the 2022 version. Ideally we should add it back; alternatively, we could point to the corresponding code in swh-graph, but that would make the replication package a bit less useful on its own.
Migrated from T4682 (view on Phabricator)