package loaders: Make test failures more helpful
Some tests did the following:
- build a snapshot
- get the snapshot from the storage
- compare it with the expected snapshot
- get the origin visit from the storage and check it
If the loader built a wrong snapshot, the test fails at step 2, and the only information displayed is that the expected snapshot id does not exist, which is very unhelpful.
Instead, I reordered them as: 1, 4, 2, 3. This way, if a wrong snapshot is build by the loader, it is detected when comparing the visit, and pytest shows the two hashes. Then, the test can be modified to use the hash that is actually generated to show the actual snapshot.
This is consistent with what was already done in the pypi loader.
Additionally, I made the following changes:
- always check stats last (because a difference in numbers is hardly actionable without testing other objects)
- add a few more snapshot id checks in visits
- deduplicated a hardcoded snapshot id.
Migrated from D6607 (view on Phabricator)
Merge request reports
Activity
Build has FAILED
Patch application report for D6607 (id=24000)
Could not rebase; Attempt merge onto 5063082e...
Updating 5063082..cde82c2 Fast-forward docs/package-loader-tutorial.rst | 25 +++ swh/loader/package/archive/tests/test_archive.py | 20 +-- swh/loader/package/cran/tests/test_cran.py | 20 ++- swh/loader/package/debian/tests/test_debian.py | 80 +++++---- swh/loader/package/deposit/loader.py | 5 - swh/loader/package/deposit/tests/test_deposit.py | 78 +++++++-- swh/loader/package/nixguix/tests/test_nixguix.py | 31 +++- swh/loader/package/npm/tests/test_npm.py | 83 ++++----- swh/loader/package/opam/loader.py | 41 ++++- swh/loader/package/opam/tests/test_opam.py | 83 ++++++++- swh/loader/package/pypi/tests/test_pypi.py | 205 ++++++++++++----------- swh/loader/tests/__init__.py | 16 +- 12 files changed, 455 insertions(+), 232 deletions(-)
Changes applied before test
commit cde82c24765a77b74e04940f12e9446dd7ae05b3 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Nov 4 15:49:45 2021 +0100 package loaders: Make test failures more helpful Some tests did the following: 1. build a snapshot 2. get the snapshot from the storage 3. compare it with the expected snapshot 4. get the origin visit from the storage and check it If the loader built a wrong snapshot, the test fails at step 2, and the only information displayed is that the expected snapshot id does not exist, which is very unhelpful. Instead, I reordered them as: 1, 4, 2, 3. This way, if a wrong snapshot is build by the loader, it is detected when comparing the visit, and pytest shows the two hashes. Then, the test can be modified to use the hash that is actually generated to show the actual snapshot. This is consistent with what was already done in the pypi loader. Additionally, I made the following changes: 1. always check stats last (because a difference in numbers is hardly actionable without testing other objects) 2. add a few more snapshot id checks in visits commit 65ee04b10025474c59e0521aab7b847a74b38b00 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Oct 21 14:35:47 2021 +0200 deposit: Remove 'parent' deposit The parent is computed by the deposit as the revision of the latest deposit in the same origin before the current one. Therefore, it is redundant, as it can be recomputed from metadata + revision date. This is a preliminary change needed to make package loaders produce releases instead of revisions, as releases don't have parent relationships commit 0d04182ecce3ab34c2ed648bbafa83ab5204e0b0 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Nov 4 11:45:36 2021 +0100 opam: Write package definitions to the extrinsic metadata storage commit 18bbbae719fc9d165d5b543e54d449f5befc083a Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Nov 4 11:43:53 2021 +0100 Add missing documentation for `get_metadata_authority`.
Link to build: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/594/ See console output for more information: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/594/console
Build has FAILED
Patch application report for D6607 (id=24001)
Could not rebase; Attempt merge onto 5063082e...
Updating 5063082..91b4906 Fast-forward docs/package-loader-tutorial.rst | 25 +++ swh/loader/package/archive/tests/test_archive.py | 20 +-- swh/loader/package/cran/tests/test_cran.py | 20 ++- swh/loader/package/debian/tests/test_debian.py | 80 +++++---- swh/loader/package/deposit/loader.py | 5 - swh/loader/package/deposit/tests/test_deposit.py | 78 +++++++-- swh/loader/package/nixguix/tests/test_nixguix.py | 31 +++- swh/loader/package/npm/tests/test_npm.py | 83 ++++----- swh/loader/package/opam/loader.py | 41 ++++- swh/loader/package/opam/tests/test_opam.py | 83 ++++++++- swh/loader/package/pypi/tests/test_pypi.py | 205 ++++++++++++----------- swh/loader/tests/__init__.py | 16 +- 12 files changed, 455 insertions(+), 232 deletions(-)
Changes applied before test
commit 91b49060e86340b7db8a9737b21f41c84261b1ec Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Nov 4 15:49:45 2021 +0100 package loaders: Make test failures more helpful Some tests did the following: 1. build a snapshot 2. get the snapshot from the storage 3. compare it with the expected snapshot 4. get the origin visit from the storage and check it If the loader built a wrong snapshot, the test fails at step 2, and the only information displayed is that the expected snapshot id does not exist, which is very unhelpful. Instead, I reordered them as: 1, 4, 2, 3. This way, if a wrong snapshot is build by the loader, it is detected when comparing the visit, and pytest shows the two hashes. Then, the test can be modified to use the hash that is actually generated to show the actual snapshot. This is consistent with what was already done in the pypi loader. Additionally, I made the following changes: 1. always check stats last (because a difference in numbers is hardly actionable without testing other objects) 2. add a few more snapshot id checks in visits 3. deduplicated a hardcoded snapshot id. commit 65ee04b10025474c59e0521aab7b847a74b38b00 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Oct 21 14:35:47 2021 +0200 deposit: Remove 'parent' deposit The parent is computed by the deposit as the revision of the latest deposit in the same origin before the current one. Therefore, it is redundant, as it can be recomputed from metadata + revision date. This is a preliminary change needed to make package loaders produce releases instead of revisions, as releases don't have parent relationships commit 0d04182ecce3ab34c2ed648bbafa83ab5204e0b0 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Nov 4 11:45:36 2021 +0100 opam: Write package definitions to the extrinsic metadata storage commit 18bbbae719fc9d165d5b543e54d449f5befc083a Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Nov 4 11:43:53 2021 +0100 Add missing documentation for `get_metadata_authority`.
Link to build: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/595/ See console output for more information: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/595/console
Build is green
Patch application report for D6607 (id=24003)
Could not rebase; Attempt merge onto 5063082e...
Updating 5063082..dffd638 Fast-forward docs/package-loader-tutorial.rst | 25 +++ swh/loader/package/archive/tests/test_archive.py | 20 +-- swh/loader/package/cran/tests/test_cran.py | 20 ++- swh/loader/package/debian/tests/test_debian.py | 80 +++++---- swh/loader/package/deposit/loader.py | 5 - swh/loader/package/deposit/tests/test_deposit.py | 78 +++++++-- swh/loader/package/nixguix/tests/test_nixguix.py | 31 +++- swh/loader/package/npm/tests/test_npm.py | 83 ++++----- swh/loader/package/opam/loader.py | 41 ++++- swh/loader/package/opam/tests/test_opam.py | 83 ++++++++- swh/loader/package/pypi/tests/test_pypi.py | 205 ++++++++++++----------- swh/loader/tests/__init__.py | 16 +- swh/loader/tests/test_init.py | 4 +- 13 files changed, 458 insertions(+), 233 deletions(-)
Changes applied before test
commit dffd6387dfcdc4bb535c13916299d246c0f4d169 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Nov 4 15:49:45 2021 +0100 package loaders: Make test failures more helpful Some tests did the following: 1. build a snapshot 2. get the snapshot from the storage 3. compare it with the expected snapshot 4. get the origin visit from the storage and check it If the loader built a wrong snapshot, the test fails at step 2, and the only information displayed is that the expected snapshot id does not exist, which is very unhelpful. Instead, I reordered them as: 1, 4, 2, 3. This way, if a wrong snapshot is build by the loader, it is detected when comparing the visit, and pytest shows the two hashes. Then, the test can be modified to use the hash that is actually generated to show the actual snapshot. This is consistent with what was already done in the pypi loader. Additionally, I made the following changes: 1. always check stats last (because a difference in numbers is hardly actionable without testing other objects) 2. add a few more snapshot id checks in visits 3. deduplicated a hardcoded snapshot id. commit 65ee04b10025474c59e0521aab7b847a74b38b00 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Oct 21 14:35:47 2021 +0200 deposit: Remove 'parent' deposit The parent is computed by the deposit as the revision of the latest deposit in the same origin before the current one. Therefore, it is redundant, as it can be recomputed from metadata + revision date. This is a preliminary change needed to make package loaders produce releases instead of revisions, as releases don't have parent relationships commit 0d04182ecce3ab34c2ed648bbafa83ab5204e0b0 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Nov 4 11:45:36 2021 +0100 opam: Write package definitions to the extrinsic metadata storage commit 18bbbae719fc9d165d5b543e54d449f5befc083a Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Nov 4 11:43:53 2021 +0100 Add missing documentation for `get_metadata_authority`.
See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/596/ for more details.
assert_last_visit_matches
andcheck_snapshot
should probably be merged somehow:it would make sense to show the (full) diff between the branches of the expected snapshot and the actual snapshot added to the storage (regardless of
id
), rather than just having what is essentially a boolean "yes, the snapshot matches" (which is essentially what happens when the ids are checked for equality).Overall I think your logic is sound, I just think it should go one step further.
(maybe making the diff recursive, i.e. "these branches are different, and here are the objects they point to" would make sense as well, but I don't think we have a lot of deep structure for the objects that we expect to have archived in these tests)
Sure! But I'd rather do it later -> #3704
mentioned in merge request !236 (closed)
mentioned in merge request !237 (closed)
mentioned in issue #3704