- Mar 16, 2023
-
- Mar 14, 2023
-
- Mar 06, 2023
-
-
vlorentz authored
-
- Mar 01, 2023
-
-
vlorentz authored
- Feb 23, 2023
-
-
Jérémy Bobbio (Lunar) authored
GitLab displays the content of the README file when browsing the repository. However, if the file is a symlink, it displays the path the symlink points to instead. There is a six-year-old issue about this: https://gitlab.com/gitlab-org/gitlab/-/issues/15093. We can work around the issue by keeping the content at the root of the repository and placing a symlink to this file in the `docs/` directory. Tested in swh-py-template!27.
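The layout described above can be sketched in Python as follows. This is only an illustration: the file name `README.rst`, the helper `link_readme_into_docs`, and the temporary directory standing in for a checkout are all assumptions, not taken from the commit.

```python
import os
import tempfile
from pathlib import Path


def link_readme_into_docs(repo_root: Path, readme_name: str = "README.rst") -> Path:
    """Keep the real README at the repository root and symlink it from docs/.

    Hypothetical helper; the README file name is an assumption.
    """
    docs_dir = repo_root / "docs"
    docs_dir.mkdir(exist_ok=True)
    link = docs_dir / readme_name
    # Use a relative target so the link stays valid wherever the repo is cloned.
    target = os.path.join(os.pardir, readme_name)
    if link.is_symlink() or link.exists():
        link.unlink()
    link.symlink_to(target)
    return link


# Demonstration in a throwaway directory standing in for a repository checkout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "README.rst").write_text("Example project\n")
    link = link_readme_into_docs(root)
    link_is_symlink = link.is_symlink()
    link_target = os.readlink(link)
    link_content = link.read_text()
```

With this layout the regular file sits at the root, where GitLab renders it, and `docs/` only holds the symlink.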
-
- Feb 17, 2023
-
-
Antoine Lambert authored
Related to swh/meta#4960
-
Antoine Lambert authored
-
- Feb 16, 2023
-
-
Jérémy Bobbio (Lunar) authored
Related to swh/meta#4959
-
- Feb 02, 2023
-
-
Antoine Lambert authored
This fixes Python 3.7 support, which broke because poetry, a dependency of isort, dropped support for that Python version in a recent release.
-
- Dec 19, 2022
-
-
vlorentz authored
Before this commit, UploadExportToS3 and DownloadExportFromS3 assumed that the set of object types was the same as the set of directories, which is wrong:
* for the `edges` format, there is no origin_visit or origin_visit_status directory
* for both the `edges` and `orc` formats, relational tables were missing.

A possible fix would have been to use the `swh.dataset.relational.TABLES` constant and keep ignoring non-existent directories in the `edges` format, but I decided to simply list directories instead, as this will prevent future issues if we decide to add directories that do not match any table in Athena for whatever reason.
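As a rough sketch of the "list directories instead" approach, assuming a local export directory: the helper name `list_export_dirs` and the example directory names are hypothetical, and the actual tasks may enumerate directories differently.

```python
import tempfile
from pathlib import Path
from typing import List


def list_export_dirs(export_path: Path) -> List[str]:
    """Return the names of the subdirectories actually present in an export.

    Hypothetical helper: it discovers directories rather than deriving them
    from a fixed list of object types, so missing or extra ones are handled.
    """
    return sorted(p.name for p in export_path.iterdir() if p.is_dir())


# Demonstration with a fake export layout (directory names are illustrative).
with tempfile.TemporaryDirectory() as tmp:
    export = Path(tmp)
    for name in ("origin", "snapshot", "revision"):
        (export / name).mkdir()
    # Regular files such as meta.json are not directories and are skipped.
    (export / "meta.json").write_text("{}")
    found_dirs = list_export_dirs(export)
```

Because the list comes from the filesystem, a format that lacks a given directory or gains a new one needs no change to the upload or download code.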
-
vlorentz authored
Stamp files are only useful while building, and not copied to and from S3, so the check failed after a round-trip through S3.
-
Antoine Lambert authored
-
Antoine Lambert authored
-
Antoine Lambert authored
In order to remove warnings about /apidoc/*.rst files being included multiple times in toc when building full swh documentation, prefer to include module indices only when building standalone package documentation. Related to T4496
- Dec 06, 2022
- Nov 29, 2022
-
-
vlorentz authored
Otherwise, UploadToS3, DownloadToS3, and RunAll would conflict with tasks about to be defined in swh.graph; and Luigi requires task names to be globally unique.
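The naming constraint can be illustrated with a toy registry. This sketch does not use Luigi and is not its implementation; `TaskRegistry`, `make_task`, and the name `UploadGraphToS3` are hypothetical and only show why two modules cannot both define a task called `UploadToS3`.

```python
from typing import Dict


class TaskRegistry:
    """Toy stand-in for a global task registry keyed by class name."""

    def __init__(self) -> None:
        self._tasks: Dict[str, type] = {}

    def register(self, cls: type) -> type:
        name = cls.__name__
        # Two distinct classes sharing one name cannot be told apart by name.
        if name in self._tasks and self._tasks[name] is not cls:
            raise ValueError(f"task name {name!r} is already registered")
        self._tasks[name] = cls
        return cls


registry = TaskRegistry()


def make_task(name: str) -> type:
    """Create and register an empty task class with the given name."""
    return registry.register(type(name, (), {}))


# Distinct, prefixed names coexist without trouble.
make_task("UploadExportToS3")
make_task("UploadGraphToS3")  # hypothetical name for a task in another module

# Reusing a generic name from a second module triggers the conflict.
make_task("UploadToS3")
try:
    make_task("UploadToS3")
    conflict_detected = False
except ValueError:
    conflict_detected = True
```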
- Nov 24, 2022
- Nov 21, 2022
-
-
vlorentz authored
They are only useful while exporting the dataset -- after the export is finished, meta.json is good enough and stamp files only save a couple of minutes when only some object types are needed (i.e. never in practice).
-
- Nov 15, 2022
-
-
vlorentz authored
This will allow running swh-graph tasks easily on machines that didn't export the graph themselves.
-
- Nov 10, 2022
-
-
vlorentz authored
Other tasks will import them in order to depend on tasks defined here
-
vlorentz authored
They are geared toward running automatically, as they call each other as needed, and can be imported by workflows defined in other modules (e.g. the future swh.graph.luigi module).
-
vlorentz authored
So it can be reused by a Luigi task
-
vlorentz authored
For some reason, using a nonexistent database works with credentials that have unnecessarily high privileges (though it is not clear to me which permissions allow this).
-
- Nov 04, 2022
-
-
vlorentz authored
-
- Nov 03, 2022
-
-
vlorentz authored
- Oct 18, 2022
-
-
David Douard authored
  - pre-commit from 4.1.0 to 4.3.0,
  - codespell from 2.2.1 to 2.2.2,
  - black from 22.3.0 to 22.10.0 and
  - flake8 from 4.0.1 to 5.0.4.

Also freeze flake8 dependencies. Also change flake8's repo config to GitHub (the GitLab mirror being outdated).
-
- Sep 08, 2022
-
-
vlorentz authored
-
- Aug 29, 2022
- Jul 06, 2022
-
-
Antoine Pietri authored
-
- Jun 21, 2022
-
-
Nicolas Dandrimont authored
We are removing support for the objstorage computing the object id itself.
-
- May 23, 2022
-
-
Antoine Pietri authored
-