luigi: Dynamically list directories instead of using object_types
Before this commit, UploadExportToS3 and DownloadExportFromS3 assumed the set of object types was the same as the set of directories, which is wrong:
- for the edges format, there is no origin_visit or origin_visit_status directory
- for both edges and orc formats, this was missing relational tables.
A possible fix would have been to use the swh.dataset.relational.TABLES
constant and keep ignoring non-existing dirs in the edges format, but I decided to
simply list directories instead, as it will prevent future issues if we
decide to add directories that do not match any table in Athena for
whatever reason.
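
As a rough illustration of the directory-listing approach, here is a minimal sketch on a local filesystem. The function name list_export_directories and its format_root parameter are hypothetical and are not taken from the actual luigi tasks, which may list directories differently (for example on S3 rather than on local disk).

```python
from pathlib import Path


def list_export_directories(format_root: Path) -> list[str]:
    """Return the sorted names of the directories directly under format_root.

    Hypothetical helper: it discovers what to transfer from what is actually
    on disk, instead of deriving it from a fixed list of object types.
    """
    return sorted(entry.name for entry in format_root.iterdir() if entry.is_dir())
```

With this approach, a format that lacks a directory (such as origin_visit in edges) needs no special case, and any extra directory (such as a relational table) is picked up automatically.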
Migrated from D8965 (view on Phabricator)