The test graph is too hard to change
As discussed in !547 or swh-fuse!93 (comment 197559), we need to find a way to test new edges as we find them. The options I see:
- Add them to the test graph. Downsides: This bloats the Git repository (because we need to recreate 500kB worth of binary files), and requires changing plenty of swh-graph/swh-datasets/swh-provenance tests (i.e. it's a breaking change). See !161 (closed) for an example, from before we split these tests across repositories.
- Add a new test graph every time we find new edge cases. Downsides: This bloats both the Git repository and checkouts (again by 500kB, or 250kB if we exclude the ORC and edge-CSV exports).
- Generate graphs on the fly, as a test- or session-scoped fixture. This would allow each dependent package to have its own test graphs, and to run tests on very specific graphs. Downsides: This increases the test suite runtime (5s of wall/CPU time on my machine for the test graph, so probably about 20s on Jenkins).
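
To make the third option more concrete, here is a minimal sketch of what the input side of an on-the-fly graph could look like. Everything in it is an assumption for illustration: the helper names (`write_edges`, `read_edges`), the `edges.csv` file name, and the one `src dst` pair per line layout are hypothetical, and the actual compression step that turns the edge list into a usable graph is deliberately left out.

```python
from pathlib import Path
from typing import Iterable, List, Tuple

# An edge is a (source SWHID, destination SWHID) pair.
Edge = Tuple[str, str]


def write_edges(edges: Iterable[Edge], directory: Path) -> Path:
    """Write a deduplicated, sorted edge list to <directory>/edges.csv.

    Hypothetical helper: the file name and the space-separated layout
    are assumptions, not the format any swh tool is known to require.
    """
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / "edges.csv"
    with path.open("w") as f:
        # Sorting makes the output deterministic across test runs.
        for src, dst in sorted(set(edges)):
            f.write(f"{src} {dst}\n")
    return path


def read_edges(path: Path) -> List[Edge]:
    """Read back the edge list written by write_edges."""
    result = []
    for line in path.read_text().splitlines():
        if line:
            src, dst = line.split()
            result.append((src, dst))
    return result
```

In a pytest suite, a dependent package could wrap `write_edges` in a fixture declared with `@pytest.fixture(scope="session")`, use `tmp_path_factory.mktemp(...)` for the output directory, and run the graph compression on the result, so the cost mentioned above is paid once per session rather than once per test.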