
utils.split_range: Make computed ranges not overlap

Existing listers use the is_within_bound [1] method from the base lister. This method uses inclusive boundaries in all cases.

Some "range" task listers [2] [3] use the split_range function, which creates overlapping ranges. Combined with the inclusive boundaries, this may be the cause of the concurrent insert issue we found [4].

This commit adapts the function split_range to make the generated ranges no longer overlap.
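
To illustrate the issue, here is a minimal sketch of why ranges that end where the next one starts collide under inclusive bounds, and how disjoint ranges avoid it. It is not the swh-lister code: `is_within_inclusive_bounds`, `overlapping_ranges`, `disjoint_ranges` and `owners` are hypothetical names made up for this example, and the real `split_range` signature and output may differ.

```python
from typing import Iterator, List, Tuple

Range = Tuple[int, int]


def is_within_inclusive_bounds(value: int, lower: int, upper: int) -> bool:
    """Inclusive check on both ends (assumed to mirror is_within_bound)."""
    return lower <= value <= upper


def overlapping_ranges(total: int, size: int) -> Iterator[Range]:
    """Naive split: each range ends on the index where the next one starts."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, total, size):
        yield start, min(start + size, total)


def disjoint_ranges(total: int, size: int) -> Iterator[Range]:
    """Split 0..total (inclusive) into ranges that share no index."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, total + 1, size):
        yield start, min(start + size - 1, total)


def owners(value: int, ranges: List[Range]) -> List[Range]:
    """Return every range that would claim `value` under inclusive bounds."""
    return [r for r in ranges if is_within_inclusive_bounds(value, *r)]


if __name__ == "__main__":
    old = list(overlapping_ranges(20, 10))
    new = list(disjoint_ranges(20, 10))
    print(old)              # [(0, 10), (10, 20)]
    print(new)              # [(0, 9), (10, 19), (20, 20)]
    print(owners(10, old))  # [(0, 10), (10, 20)]
    print(owners(10, new))  # [(10, 19)]
```

With the overlapping split, index 10 belongs to two ranges, so two concurrent listers could both try to insert it. With the disjoint split, every index has exactly one owner.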

Related to #2577 (closed)

Test Plan

tox


Migrated from D3899 (view on Phabricator)

Activity

  • Build has FAILED

    Patch application report for D3899 (id=13743)

    Could not rebase; Attempt merge onto 66a61f3d...

    Updating 66a61f3..e407feb
    Fast-forward
     swh/lister/tests/test_utils.py | 37 +++++++++++++++++++++----------------
     swh/lister/utils.py            |  8 +++++---
     2 files changed, 26 insertions(+), 19 deletions(-)
    Changes applied before test
    commit e407feb8c28b805db8ba220080d20f56d3287c50
    Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
    Date:   Wed Sep 9 18:50:46 2020 +0200
    
        utils.split_range: Make computed ranges not overlap
        
        Existing listers use the `is_within_bound` [1] method from the base lister.
        This method uses inclusive boundaries in all cases.
        
        As some "range" task listers [2] [3] are using `split_range` function to create
        "overlapping" ranges. So, when those range overlap, this can create concurrent
        insert issues down the line.
        
        This commit adapts the function `split_range` to make the generated ranges no
        longer overlap.
        
        - [1]
        https://forge.softwareheritage.org/source/swh-lister/browse/master/swh/lister/core/lister_base.py$194-199
        
        - [2]
        https://forge.softwareheritage.org/source/swh-lister/browse/master/swh/lister/gitlab/tasks.py$37-41
        
        - [3]
        https://forge.softwareheritage.org/source/swh-lister/browse/master/swh/lister/gitea/tasks.py$36-41
        
        Related to #2577
    
    commit 725c1fe4ad076c71d51d0c2998dcb4aaedd4b6bb
    Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
    Date:   Wed Sep 9 18:48:07 2020 +0200
    
        test_utils: Migrate to pytest

    Link to build: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/41/
    See console output for more information: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/41/console

    • Fix remaining task tests (on relister which uses ranges)
    • Add docstring on utils.split_range (with samples)
  • Build is green

    Patch application report for D3899 (id=13745)

    Could not rebase; Attempt merge onto 66a61f3d...

    Updating 66a61f3..6b880bc
    Fast-forward
     swh/lister/gitea/tests/test_tasks.py  | 33 +++++++++++++------------------
     swh/lister/gitlab/tests/test_tasks.py | 33 +++++++++++++------------------
     swh/lister/tests/test_utils.py        | 37 ++++++++++++++++++++---------------
     swh/lister/utils.py                   | 21 +++++++++++++++++---
     4 files changed, 67 insertions(+), 57 deletions(-)
    Changes applied before test
    commit 6b880bc00225ac05a94f001d6591ee394823a039
    Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
    Date:   Wed Sep 9 18:50:46 2020 +0200
    
        utils.split_range: Split into not overlapping ranges
        
        Existing listers use the `is_within_bound` [1] method from the base lister.
        This method uses inclusive boundaries in all cases.
        
        As some "range" task listers [2] [3] are using `split_range` function to create
        "overlapping" ranges. So, when those range overlap, this can create concurrent
        insert issues down the line.
        
        This commit adapts the function `split_range` to make the generated ranges no
        longer overlap.
        
        - [1]
        https://forge.softwareheritage.org/source/swh-lister/browse/master/swh/lister/core/lister_base.py$194-199
        
        - [2]
        https://forge.softwareheritage.org/source/swh-lister/browse/master/swh/lister/gitlab/tasks.py$37-41
        
        - [3]
        https://forge.softwareheritage.org/source/swh-lister/browse/master/swh/lister/gitea/tasks.py$36-41
        
        Related to #2577
    
    commit 725c1fe4ad076c71d51d0c2998dcb4aaedd4b6bb
    Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
    Date:   Wed Sep 9 18:48:07 2020 +0200
    
        test_utils: Migrate to pytest

    See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/42/ for more details.

  • add --doctest-modules to the arguments of pytest in tox.ini

  • Vincent Sellier mentioned in merge request !158 (closed)

  • Vincent Sellier mentioned in merge request !331 (closed)

  • Tested in the docker-environment, the problem is not reproduced anymore with 5 concurrent listers.

  • Merge request was accepted

  • Vincent Sellier approved this merge request

  • add --doctest-modules to the arguments of pytest in tox.ini

    Ack, I'll do it in another diff, as other unrelated parts broke when adding the pytest --doctest-modules flag.

    Tested in the docker-environment, the problem is not reproduced anymore with 5 concurrent listers.

    great ;)

  • Rework commit message

  • Build is green

    Patch application report for D3899 (id=13753)

    Rebasing onto 725c1fe4...

    Current branch diff-target is up to date.
    Changes applied before test
    commit e3c856b5eef574427533fdb682163087337b2d8c
    Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
    Date:   Wed Sep 9 18:50:46 2020 +0200
    
        utils.split_range: Split into not overlapping ranges
        
        Existing listers use the `is_within_bound` [1] method from the base lister.
        This method uses inclusive boundaries in all cases.
        
        As some "range" task listers [2] [3] are using `split_range` function to create
        "overlapping" ranges, this can cause concurrent insert issues down the line [4].
        
        This commit adapts the function `split_range` to make the generated ranges no
        longer overlap.
        
        - [1]
        https://forge.softwareheritage.org/source/swh-lister/browse/master/swh/lister/core/lister_base.py$194-199
        
        - [2]
        https://forge.softwareheritage.org/source/swh-lister/browse/master/swh/lister/gitlab/tasks.py$37-41
        
        - [3]
        https://forge.softwareheritage.org/source/swh-lister/browse/master/swh/lister/gitea/tasks.py$36-41
        
        Related to #2577

    See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/43/ for more details.

  • Merge request was merged

  • mentioned in commit d1ce9b09
