scheduler-recurrent: Adapt scheduling default policy so origins without last update get regularly scheduled

Merged Antoine R. Dumont requested to merge staging-fix-schedule-recurrent-config into production
2 unresolved threads

Origins without a last update are currently not scheduled.

(octo-diff would not work because I used the wrong branch to compare... ;)

[1]

$SWH_PUPPET_ENVIRONMENT_HOME/bin/octocatalog-diff --to staging-fix-schedule-recurrent-config scheduler0
Found host scheduler0.internal.staging.swh.network
Cloning into '/tmp/swh-ocd.Tk5NYTYc/swh-site'...
done.
branch 'staging-fix-schedule-recurrent-config' set up to track 'origin/staging-fix-schedule-recurrent-config'.
Switched to a new branch 'staging-fix-schedule-recurrent-config'
WARN     -> Environment "staging-fix-schedule-recurrent-config" contained non-word characters, correcting name to staging_fix_schedule_recurrent_config
Cloning into '/tmp/swh-ocd.Tk5NYTYc/environments/production/data/private'...
done.
Cloning into '/tmp/swh-ocd.Tk5NYTYc/environments/staging_fix_schedule_recurrent_config/data/private'...
done.
*** Running octocatalog-diff on host scheduler0.internal.staging.swh.network
I, [2023-07-04T16:23:49.725788 #1046096]  INFO -- : Catalogs compiled for scheduler0.internal.staging.swh.network
I, [2023-07-04T16:23:50.111775 #1046096]  INFO -- : Diffs computed for scheduler0.internal.staging.swh.network
diff origin/production/scheduler0.internal.staging.swh.network current/scheduler0.internal.staging.swh.network
*******************************************
  File[/etc/softwareheritage/scheduler/listener-runner.yml] =>
   parameters =>
     content =>
      @@ -6,3 +6,14 @@
       celery:
         task_broker: amqp://guest:guest@127.0.0.1:5672/%2f
      +scheduling_policy:
      +  default:
      +  - policy: already_visited_order_by_lag
      +    weight: 40
      +  - policy: never_visited_oldest_update_first
      +    weight: 40
      +  - policy: origins_without_last_update
      +    weight: 20
      +  opam:
      +  - policy: origins_without_last_update
      +    weight: 100
      _
*******************************************
*** End octocatalog-diff on scheduler0.internal.staging.swh.network
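
The diff above adds a `default` policy list and an `opam` override. As a minimal sketch of how such a mapping could be consulted per visit type, assuming a lookup that falls back to `default` when a visit type has no entry of its own (the `policy_for` helper and the `SCHEDULING_POLICY` constant are hypothetical names used for illustration, not taken from swh.scheduler):

```python
from typing import Dict, List

# Values copied from the octocatalog-diff output above.
SCHEDULING_POLICY: Dict[str, List[dict]] = {
    "default": [
        {"policy": "already_visited_order_by_lag", "weight": 40},
        {"policy": "never_visited_oldest_update_first", "weight": 40},
        {"policy": "origins_without_last_update", "weight": 20},
    ],
    "opam": [
        {"policy": "origins_without_last_update", "weight": 100},
    ],
}


def policy_for(visit_type: str, config: Dict[str, List[dict]]) -> List[dict]:
    """Hypothetical helper: return the policy list for a visit type.

    Assumption: visit types without a dedicated entry use "default".
    """
    return config.get(visit_type, config["default"])


if __name__ == "__main__":
    for visit_type in ("opam", "git"):
        names = [entry["policy"] for entry in policy_for(visit_type, SCHEDULING_POLICY)]
        print(visit_type, names)
```

Under that assumption, only opam visits would use the single-policy list, and every other visit type would share the 40/40/20 split.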

Refs. swh/infra/sysadm-environment#4971 (closed)

Edited by Antoine R. Dumont

Activity

    • I don't think you need all that complexity. the scheduling policies can just go into the main scheduler config file (they'll be ignored by the tools that don't need them):

      commit 9025fc8a7b267c5f88d608523eca1e93dbee6231
      Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
      Date:   Tue Jul 4 15:43:13 2023 +0200
      
          Update default scheduling policy
      
      diff --git a/data/common/common.yaml b/data/common/common.yaml
      index 5a2e55fd..9a8c69f4 100644
      --- a/data/common/common.yaml
      +++ b/data/common/common.yaml
      @@ -2672,6 +2672,14 @@ swh::deploy::scheduler::config:
         <<: *swh_scheduler_local_config
         celery:
           task_broker: "%{alias('swh::deploy::scheduler::task_broker')}"
      +  scheduling_policies:
      +    default:
      +      - policy: already_visited_order_by_lag
      +        weight: 40
      +      - policy: never_visited_oldest_update_first
      +        weight: 40
      +      - policy: origins_without_last_update
      +        weight: 20
       swh::deploy::scheduler::packages:
         - python3-swh.lister
         - python3-swh.loader.bzr

      seems to add the expected configs to saatchi and scheduler0.

    • I don't think you need all that complexity. the scheduling policies can just go into the main scheduler config file (they'll be ignored by the tools that don't need them):

      I was unsure it would get ignored, great to know. I'll adapt.

      I don't want to touch the default config though.

      (fwiw, I think that if the swh.scheduler default is inadequate, which it seems, it should be changed there as well)

      Well, I guess I can adapt according to what you suggested.

      In any case though, as opam origins don't have any last update at all [1], is it ok for the opam policy to contain only origins_without_last_update with a weight of 100?

      [1] I asked @anlambert and his attempts were not conclusive, as that cannot be inferred consistently

    • In practice the last scheduling policy will try to fill all available slots, so it shouldn't make much difference. Of course if the other two policies are useless, no point having them (until we manage to infer a last update for these origins, and we end up noticing three years later that they're not getting scheduled anymore)
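
The point above can be illustrated with a small simulation. This is only a sketch under stated assumptions: slots are shared in proportion to the weights, and the last policy in the list may also use any slots the earlier policies left empty. The functions `split_slots` and `schedule`, and the `available` counts, are hypothetical and not the swh.scheduler implementation.

```python
from typing import Dict, List


def split_slots(num_visits: int, policy_cfg: List[dict]) -> Dict[str, int]:
    """Hypothetical helper: share num_visits between policies by weight."""
    total_weight = sum(entry["weight"] for entry in policy_cfg)
    if total_weight <= 0:
        raise ValueError("policy weights must sum to a positive number")
    return {
        entry["policy"]: num_visits * entry["weight"] // total_weight
        for entry in policy_cfg
    }


def schedule(
    num_visits: int, policy_cfg: List[dict], available: Dict[str, int]
) -> Dict[str, int]:
    """Simulate one scheduling run.

    `available` maps a policy name to the number of origins it could return.
    Assumption: the last policy may take every slot still unused.
    """
    shares = split_slots(num_visits, policy_cfg)
    scheduled: Dict[str, int] = {}
    remaining = num_visits
    for index, entry in enumerate(policy_cfg):
        name = entry["policy"]
        is_last = index == len(policy_cfg) - 1
        quota = remaining if is_last else shares[name]
        taken = min(quota, available.get(name, 0), remaining)
        scheduled[name] = taken
        remaining -= taken
    return scheduled


DEFAULT = [
    {"policy": "already_visited_order_by_lag", "weight": 40},
    {"policy": "never_visited_oldest_update_first", "weight": 40},
    {"policy": "origins_without_last_update", "weight": 20},
]
OPAM_ONLY = [{"policy": "origins_without_last_update", "weight": 100}]

if __name__ == "__main__":
    # opam-like case: no origin carries a last update, so only the last
    # policy has candidates (5000 is an arbitrary illustrative count).
    opam_like = {"origins_without_last_update": 5000}
    print(schedule(1000, DEFAULT, opam_like))
    print(schedule(1000, OPAM_ONLY, opam_like))
```

In this model, both configurations end up filling all 1000 slots from `origins_without_last_update` when the other two policies find nothing, which matches the remark that a dedicated opam policy should not make much difference.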

  • (fwiw, I think that if the swh.scheduler default is inadequate, which it seems, it should be changed there as well)

  • added 1 commit

    • e156fef1 - scheduler-recurrent: Adapt default scheduling policy & add specific opam policy

    • In your updated commit message, you wrote:

      The current default policy is not appropriate as too few origins without last update are scheduled.

      I guess that's technically accurate, but the current default is to not schedule origins without last update at all, so that's a bit misleading: "too few" somewhat implies that we're lagging, but we're really not doing it at all.

      Is this also an issue for non-opam listers? If it is, we should definitely be changing the swh.scheduler defaults. How should we be monitoring this so that it doesn't happen again?

    • I guess that's technically accurate, but the current default is to not schedule origins without last update at all, so that's a bit misleading: "too few" somewhat implies that we're lagging, but we're really not doing it at all.

      Right, I had forgotten we enforced it, hence my accurate but misleading statement. I'll probably split the change into 2 commits then: one for the default policy (with a proper message) so that it schedules some origins without last update, and another for opam.

      Is this also an issue for non-opam listers?

      I guess it is, as we do have a few other listers which do not have any last update. After checking some: bower does not have any, conda is not guaranteed to have a last update, nor is cran, and I stopped there. I recall we are trying to enforce its use, but sometimes the information is just not there (during reviews or development).

      If it is, we should definitely be changing the swh.scheduler defaults.

      Yes, we should change it, but 20 (as per my last change) is maybe a bit much.

      How should we be monitoring this so that it doesn't happen again?

      That, I don't know.

      Note that I also recall having mixed feelings about the scheduling on last update policy, which tends to create a high number of visits. And I don't know how to reconcile that with this MR either...

      Edited by Antoine R. Dumont
    • Note that I also recall having mixed feelings about the scheduling on last update policy, which tends to create a high number of visits. And I don't know how to reconcile that with this MR either...

      How so? It should only generate either zero (no update to the last_update field) or one (last_update updated) visit for each origin for each run of the lister.

    • How so? It should only generate either zero (no update to the last_update field) or one (last_update updated) visit for each origin for each run of the lister.

      I may be misremembering, but a long time ago I saw origins getting scheduled in a loop, and those origins corresponded to the ones we know have a high number of visits. That said, it might simply have been the save-code-now integration checks, which triggered a lot for those. And with time, I conflated the two.

      Thanks for making me think back on that.

  • Antoine R. Dumont added 2 commits

    • 40d73e38 - recurrent: Adapt default policy so origins without last update are scheduled
    • 337fed6a - recurrent: Add opam policy so all origins are regularly scheduled

  • Antoine R. Dumont changed the description

  • Nicolas Dandrimont approved this merge request

  • Antoine R. Dumont changed title from scheduler-recurrent: Adapt scheduling policy so opam origins get listed to scheduler-recurrent: Adapt scheduling default policy so origins without last update get regularly scheduled

  • I'll land this once I have updated the opam stack first (to avoid failing ingestions).

  • Tested in staging and it does the job:

    Jul 04 15:25:30 scheduler0 swh[2131415]: INFO:swh.scheduler.celery_backend.recurrent_visits:opam: 1000 visits scheduled in queue swh.loader.package.opam.tasks.LoadOpam
    Jul 04 15:25:44 scheduler0 swh[2131415]: INFO:swh.scheduler.celery_backend.recurrent_visits:opam: 1000 visits scheduled in queue swh.loader.package.opam.tasks.LoadOpam
    Jul 04 15:28:54 scheduler0 swh[2131415]: INFO:swh.scheduler.celery_backend.recurrent_visits:opam: 390 visits scheduled in queue swh.loader.package.opam.tasks.LoadOpam
    Jul 04 15:28:58 scheduler0 swh[2131415]: INFO:swh.scheduler.celery_backend.recurrent_visits:opam: 490 visits scheduled in queue swh.loader.package.opam.tasks.LoadOpam
    Jul 04 15:29:03 scheduler0 swh[2131415]: INFO:swh.scheduler.celery_backend.recurrent_visits:opam: 591 visits scheduled in queue swh.loader.package.opam.tasks.LoadOpam
    Jul 04 15:29:09 scheduler0 swh[2131415]: INFO:swh.scheduler.celery_backend.recurrent_visits:opam: 300 visits scheduled in queue swh.loader.package.opam.tasks.LoadOpam
    Jul 04 15:30:21 scheduler0 swh[2131415]: INFO:swh.scheduler.celery_backend.recurrent_visits:opam: 279 visits scheduled in queue swh.loader.package.opam.tasks.LoadOpam
    Jul 04 15:30:24 scheduler0 swh[2131415]: INFO:swh.scheduler.celery_backend.recurrent_visits:opam: 279 visits scheduled in queue swh.loader.package.opam.tasks.LoadOpam
    Jul 04 15:30:32 scheduler0 swh[2131415]: INFO:swh.scheduler.celery_backend.recurrent_visits:opam: 100 visits scheduled in queue swh.loader.package.opam.tasks.LoadOpam
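
To summarise journal output like the lines above, a short script can total the scheduled visits per visit type. This is a minimal sketch: the regular expression is written against the log format shown here only, and `total_scheduled` is a hypothetical helper, not part of swh.scheduler.

```python
import re
from typing import Dict, Iterable

# Matches e.g. "...recurrent_visits:opam: 1000 visits scheduled in queue <queue>"
LINE_RE = re.compile(
    r"recurrent_visits:(?P<visit_type>[\w-]+): "
    r"(?P<count>\d+) visits scheduled in queue (?P<queue>\S+)"
)

# Three of the lines quoted above, used as sample input.
SAMPLE = [
    "Jul 04 15:25:30 scheduler0 swh[2131415]: INFO:swh.scheduler.celery_backend.recurrent_visits:opam: 1000 visits scheduled in queue swh.loader.package.opam.tasks.LoadOpam",
    "Jul 04 15:28:54 scheduler0 swh[2131415]: INFO:swh.scheduler.celery_backend.recurrent_visits:opam: 390 visits scheduled in queue swh.loader.package.opam.tasks.LoadOpam",
    "Jul 04 15:30:32 scheduler0 swh[2131415]: INFO:swh.scheduler.celery_backend.recurrent_visits:opam: 100 visits scheduled in queue swh.loader.package.opam.tasks.LoadOpam",
]


def total_scheduled(lines: Iterable[str]) -> Dict[str, int]:
    """Sum the 'N visits scheduled' counts per visit type; skip other lines."""
    totals: Dict[str, int] = {}
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue
        visit_type = match.group("visit_type")
        totals[visit_type] = totals.get(visit_type, 0) + int(match.group("count"))
    return totals


if __name__ == "__main__":
    print(total_scheduled(SAMPLE))
```

Feeding it the full journal (for instance lines read from standard input) would give the overall number of opam visits scheduled during the test window.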