Skip to content
Snippets Groups Projects

Handle missing objects in `revision_log()` and `revision_shortlog()`

2 unresolved threads

Handle missing objects in revision_log() and revision_shortlog()

A revision in a log can be missing from the storage. While holes are unusual, they can happen. Dedicated tests were added for the issue.

TODO:

  • A decision needs to be made on what should be returned by revision_short_log() in case of a missing revision.
Edited by Jérémy Bobbio (Lunar)

Merge request reports

Members who can merge are allowed to add commits.

Pipeline #9462 failed

Pipeline failed for 305d83ef on lunar:fix-revision-log-on-missing-objects

Approval is optional
Merge blocked: 1 check failed
Merge request must be rebased, because a fast-forward merge is not possible.

Merge details

  • The source branch is 27 commits behind the target branch.
  • 2 commits will be added to master.
  • Source branch will be deleted.

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • Jenkins job DSTO/gitlab-builds #665 succeeded .
    See Console Output and Coverage Report for more details.

    • Could you add a test? (insert revisions 2, 3, 4 and not 1 from storage_data; then query revision_log from 3 or 4)

      Edited by vlorentz
    • Thanks for the prompt. The situation seems pretty uneven:

      swh/storage/tests/blocking/test_proxy_blocking.py::TestStorage::test_revision_log_missing_revision <- swh/storage/tests/storage_tests.py FAILED                                                             [  9%]
      swh/storage/tests/blocking/test_proxy_no_blocking.py::TestStorage::test_revision_log_missing_revision <- swh/storage/tests/storage_tests.py FAILED                                                          [ 18%]
      swh/storage/tests/masking/test_proxy_masking.py::TestStorage::test_revision_log_missing_revision <- swh/storage/tests/storage_tests.py FAILED                                                               [ 27%]
      swh/storage/tests/masking/test_proxy_passthrough.py::TestStorage::test_revision_log_missing_revision <- swh/storage/tests/storage_tests.py FAILED                                                           [ 36%]
      swh/storage/tests/test_api_client.py::TestStorageApi::test_revision_log_missing_revision <- swh/storage/tests/storage_tests.py FAILED                                                                       [ 45%]
      swh/storage/tests/test_cassandra.py::TestCassandraStorage::test_revision_log_missing_revision <- swh/storage/tests/storage_tests.py FAILED                                                                  [ 54%]
      swh/storage/tests/test_in_memory.py::TestInMemoryStorage::test_revision_log_missing_revision <- swh/storage/tests/storage_tests.py FAILED                                                                   [ 63%]
      swh/storage/tests/test_postgresql.py::TestStorage::test_revision_log_missing_revision <- swh/storage/tests/storage_tests.py PASSED                                                                          [ 72%]
      swh/storage/tests/test_postgresql_flavor_mirror.py::TestStorage::test_revision_log_missing_revision <- swh/storage/tests/storage_tests.py PASSED                                                            [ 81%]
      swh/storage/tests/test_postgresql_flavor_readreplica.py::TestStorage::test_revision_log_missing_revision <- swh/storage/tests/storage_tests.py PASSED                                                       [ 90%]
      swh/storage/tests/test_tenacious.py::TestTenaciousStorage::test_revision_log_missing_revision <- swh/storage/tests/storage_tests.py FAILED                                                                  [100%]
    • Please register or sign in to reply
  • added 1 commit

    • a99153b9 - postgresql: handle missing objects in revision_log()

    Compare with previous version

  • Jenkins job DSTO/gitlab-builds #666 failed .
    See Console Output and Coverage Report for more details.

  • added 2 commits

    • e407ba2f - Fix flaky revision_log related tests
    • 305d83ef - Handle missing objects in revision_log() and revision_shortlog()

    Compare with previous version

  • Jérémy Bobbio (Lunar) changed title from postgresql: handle missing objects in revision_log() to Hndle missing objects in {++}revision_log(){+ and revision_shortlog()+}

    changed title from postgresql: handle missing objects in revision_log() to Hndle missing objects in {++}revision_log(){+ and revision_shortlog()+}

  • Jérémy Bobbio (Lunar) changed the description

    changed the description

  • Jenkins job DSTO/gitlab-builds #667 failed .
    See Console Output and Coverage Report for more details.

  • Returning Nones in the middle of a list of revisions seems a bit weird to me (I know it's consistent with the behavior of other endpoints, but ugh).

    These two revision_log endpoints are paginated by their very nature, so clients need to already be able to resume a query from incomplete results. With that being given, I would suggest just truncating the results at the "earliest" revision in the history for which we have information in the revision table.

    Assuming a linear history with revisions A (A.parents == ()), B (B.parents == (A.id,)), C (C.parents == (B.id,)), if revision A is missing from storage, then:

    revision_log([C.id]) should return results equivalent to [C, B] revision_shortlog([C.id]) should return results similar to [(C.id, (B.id,)), (B.id, (A.id,))]

    Noticing this, a client would be able to try to resume iterating with A.id, and notice that the revision is missing somehow which could then return an error.

    • Alternatively, we could deprecate these endpoints, and create others that would have a composite return value which would be able to report this error directly (making it explicit that there's a distinction between "limit reached" and "objects missing"). But that's lots of churn for what should be, effectively, a rare edge case.

    • But that's lots of churn for what should be, effectively, a rare edge case.

      we could use this as an opportunity to make revision_log()'s successor return model objects instead of dicts (see also !1117)

    • not the worst idea!

      Edited by Nicolas Dandrimont
    • agree, this is probably the best path (even if not the shortest).

    • Please register or sign in to reply
  • mentioned in merge request swh-alter!41 (merged)

  • vlorentz changed title from Hndle missing objects in revision_log() and revision_shortlog() to Handle missing objects in revision_log() and revision_shortlog()

    changed title from Hndle missing objects in revision_log() and revision_shortlog() to Handle missing objects in revision_log() and revision_shortlog()

Please register or sign in to reply
Loading