  1. Mar 31, 2025
  2. Mar 28, 2025
  3. Mar 26, 2025
  4. Mar 21, 2025
    • Migration psycopg3 · af1245b0
      Pierre-Yves David authored and Nicolas Dandrimont committed
      The adapter registration becomes simpler, as psycopg understands enums
      natively.
      
      Most changes are about the new parameter substitution.
      
      We no longer need to raise a Scheduler error if we have multiple Visit
      updates for the same URL. It seemed fine, so I did not do extra work to
      make sure it happens.
      
      One test was related to the psycopg2 page size; we kept it to test a
      "large" batch.
      af1245b0
  5. Mar 17, 2025
    • scheduler/runner: Allow specifying task type patterns to schedule · 5f94aa08
      Antoine R. Dumont authored
      If no pattern is provided, the behavior is unchanged. When patterns are
      provided, the list of task types is filtered to keep only the task types
      that start with one of the patterns.
      
      Refs. swh/infra/sysadm-environment#5512
      5f94aa08
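The prefix filtering described in commit 5f94aa08 can be sketched as follows. This is a minimal illustration, not the runner's actual code: the function name `filter_task_types` and the sample task type names are assumptions made for the example.

```python
from typing import Iterable, List, Sequence


def filter_task_types(
    task_types: Iterable[str], patterns: Sequence[str] = ()
) -> List[str]:
    """Keep the task types that start with any of the given patterns.

    Hypothetical helper: with no patterns, every task type is kept,
    which mirrors the "behavior is unchanged" case of the commit message.
    """
    if not patterns:
        return list(task_types)
    prefixes = tuple(patterns)  # str.startswith accepts a tuple of prefixes
    return [name for name in task_types if name.startswith(prefixes)]


# Sample task type names, invented for the example.
task_types = ["list-github", "list-gitlab", "load-git", "load-svn"]
all_types = filter_task_types(task_types)
listers_only = filter_task_types(task_types, ["list-"])
```

Passing a tuple to `str.startswith` avoids an explicit inner loop over the patterns.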
    • simulator: Fix timestamps manipulation · 2b7e34b2
      Antoine R. Dumont authored
      It's unclear whether the change in this MR triggers the bug in the simulator.
      But the values being manipulated should be timestamps, and they are currently
      datetimes, so the current state fails with [1].
      
      Adding a conversion layer from datetimes to timestamps makes the tests happier.
      
      [1]
      ```
      16:17:06  low = datetime.datetime(2025, 3, 14, 15, 17, 2, 139077, tzinfo=datetime.timezone.utc)
      16:17:06  high = datetime.datetime(2025, 3, 14, 15, 17, 2, 139077, tzinfo=datetime.timezone.utc)
      16:17:06
      16:17:06      def _diff(low, high):
      16:17:06          if low == high:
      16:17:06              if low == 0:
      16:17:06                  return 0.5
      16:17:06              else:
      16:17:06  >               return abs(low * 0.1)
      16:17:06  E               TypeError: unsupported operand type(s) for *: 'datetime.datetime' and 'float'
      16:17:06
      ```
      
      Refs. swh/infra/sysadm-environment#5512
      2b7e34b2
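The failure in commit 2b7e34b2 comes from multiplying a `datetime` by a float. The sketch below reproduces only that arithmetic and shows that converting to a POSIX timestamp first makes it valid. The helper name `tenth_of` is an assumption for the example; the simulator's actual conversion layer is not shown in the commit message.

```python
from datetime import datetime, timezone


def tenth_of(value):
    """Same arithmetic as the failing line in the traceback: abs(value * 0.1)."""
    return abs(value * 0.1)


# Value taken from the traceback above.
low = datetime(2025, 3, 14, 15, 17, 2, 139077, tzinfo=timezone.utc)

try:
    tenth_of(low)  # datetime * float is not defined
    failed = False
except TypeError:
    failed = True

# Converting to a POSIX timestamp (a float) makes the arithmetic valid.
margin = tenth_of(low.timestamp())
```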
    • runner: Update task status only after sending the tasks to rabbitmq · 8445489a
      Antoine R. Dumont authored
      Prior to this, the runner called the `grab_ready{_priority}_tasks` methods.
      Those methods update the tasks' status to 'next_run_scheduled' at listing
      time, so they write to postgresql immediately.
      
      As a result, a failure to write to rabbitmq left the status updated anyway.
      The runner now calls the `peek_ready{_priority}_tasks` methods instead,
      which only fetch the list of tasks to schedule. At the end of the runner,
      a call to the `mass_schedule_task_runs` method is now in charge of updating
      the tasks' status to 'next_run_scheduled' within the same transaction.
      
      Refs. swh/infra/sysadm-environment#5512
      8445489a
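The peek-then-record flow of commit 8445489a can be illustrated with an in-memory stand-in. Everything below is a sketch under assumptions: `ToyScheduler`, `run_once`, the method signatures and the 'next_run_not_scheduled' initial status are invented for the example; only the method names `peek_ready_tasks` and `mass_schedule_task_runs` and the 'next_run_scheduled' status come from the commit message.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToyScheduler:
    """In-memory stand-in for the scheduler backend (hypothetical)."""

    status: Dict[int, str] = field(default_factory=dict)

    def peek_ready_tasks(self) -> List[int]:
        # Read-only: list the tasks to schedule without touching their status.
        return [
            tid for tid, s in self.status.items() if s == "next_run_not_scheduled"
        ]

    def mass_schedule_task_runs(self, task_ids: List[int]) -> None:
        # The only place where the status is flipped.
        for tid in task_ids:
            self.status[tid] = "next_run_scheduled"


def run_once(scheduler: ToyScheduler, send: Callable[[int], None]) -> List[int]:
    """Peek, send every task to the broker, then record the scheduling."""
    task_ids = scheduler.peek_ready_tasks()
    for tid in task_ids:
        send(tid)  # may raise; nothing has been written to the database yet
    scheduler.mass_schedule_task_runs(task_ids)
    return task_ids


def broken_send(task_id: int) -> None:
    raise ConnectionError("broker unavailable")


scheduler = ToyScheduler({1: "next_run_not_scheduled", 2: "next_run_not_scheduled"})
try:
    run_once(scheduler, broken_send)
except ConnectionError:
    pass
status_after_failure = dict(scheduler.status)  # statuses are untouched

sent: List[int] = []
run_once(scheduler, sent.append)
status_after_success = dict(scheduler.status)  # statuses are now updated
```

With a `grab`-style method, the status would already have been flipped before `send` raised, which is the situation the commit removes.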
    • celery/runner: Change write order to rabbitmq then postgresql · 259c70f3
      Antoine R. Dumont authored
      Messages are now first sent to rabbitmq then postgresql.
      
      In the nominal case where all writes are ok, that changes nothing vs the
      previous implementation (postgresql first then rabbitmq).
      
      In degraded conditions, though, that's supposedly better.
      
      1. If we cannot write to rabbitmq, then we won't write to postgresql either:
      the function will raise and stop.
      
      2. If we can write to rabbitmq first, then the messages will be consumed
      independently from this. And then, if we cannot write to postgresql (for some
      reason), we just lose the information that we already sent the task. This
      means the same task will be rescheduled and we'll have a go at it again. As
      those kinds of tasks are supposed to be idempotent, that should not be a major
      issue for their upstream.
      
      Also, those tasks are mostly listers now and they have a state management of
      their own, so the reruns should mostly be no-ops (if the ingestion from the
      previous run went fine). Edge-case scenarios such as a site being down will
      behave as before.
      
      Refs. swh/infra/sysadm-environment#5512
      259c70f3
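The second failure mode described in commit 259c70f3 (message sent, database write lost) can be walked through with a small simulation. All names here (`FlakyDatabase`, `schedule`, `consume`, the task name) are hypothetical; a plain list stands in for rabbitmq and a set for postgresql.

```python
from typing import List, Set


class FlakyDatabase:
    """Hypothetical database whose first write fails."""

    def __init__(self) -> None:
        self.scheduled: Set[str] = set()
        self.fail_next_write = True

    def record_scheduled(self, task: str) -> None:
        if self.fail_next_write:
            self.fail_next_write = False
            raise OSError("postgresql unavailable")
        self.scheduled.add(task)


def schedule(task: str, broker: List[str], db: FlakyDatabase) -> None:
    broker.append(task)        # 1. send the message first
    db.record_scheduled(task)  # 2. then record that it was sent


def consume(broker: List[str], done: Set[str]) -> int:
    """Idempotent consumer: a task already done is a no-op.

    Returns the number of tasks that actually did some work.
    """
    runs = 0
    for task in broker:
        if task not in done:
            done.add(task)
            runs += 1
    broker.clear()
    return runs


broker: List[str] = []
db = FlakyDatabase()
done: Set[str] = set()

try:
    schedule("list-example", broker, db)  # message sent, database write fails
except OSError:
    pass
first_runs = consume(broker, done)

# The database has no record of the send, so the same task is scheduled again.
schedule("list-example", broker, db)
second_runs = consume(broker, done)
```

The duplicate send costs one extra message, and the idempotent consumer turns it into a no-op, which is the trade-off the commit message accepts.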
  6. Mar 14, 2025
  7. Mar 05, 2025
  8. Feb 25, 2025
  9. Feb 17, 2025
  10. Feb 10, 2025
  11. Feb 05, 2025
  12. Dec 09, 2024
  13. Nov 29, 2024
  14. Nov 06, 2024
  15. Oct 30, 2024
  16. Oct 28, 2024
  17. Oct 24, 2024
  18. Oct 17, 2024
  19. Oct 14, 2024
    • cli/origin: Add schedule-high-priority-first-visits command · 4adc20b0
      Antoine Lambert authored
      This new command in the origin group schedules first
      visits with high priority for origins registered by listers that have
      the first_visits_priority_queue attribute set.
      
      The command ensures that the visits of all origins registered by such
      listers are scheduled with high priority after the first listing,
      regardless of whether some were already scheduled beforehand.
      
      Subsequent executions of such listers will no longer trigger visits
      with high priority, though; those will be scheduled by the recurrent
      visits runner.
      
      Related to #4687.
      4adc20b0
    • interface: Add get_visit_types_for_listed_origins method · 89c99a03
      Antoine Lambert authored
      It returns the set of visit types of the origins listed
      by a specific lister.
      
      Related to #4687.
      89c99a03
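What commit 89c99a03 describes amounts to a distinct projection over a lister's origins. The sketch below works on an in-memory list; the `ListedOrigin` fields, the function name `visit_types_for_lister` and the sample data are assumptions for the example and do not reflect the real method's signature or storage.

```python
from typing import Iterable, NamedTuple, Set


class ListedOrigin(NamedTuple):
    """Hypothetical, reduced view of a listed origin."""

    lister_id: str
    url: str
    visit_type: str


def visit_types_for_lister(
    origins: Iterable[ListedOrigin], lister_id: str
) -> Set[str]:
    """Return the distinct visit types of the origins listed by one lister."""
    return {o.visit_type for o in origins if o.lister_id == lister_id}


origins = [
    ListedOrigin("lister-a", "https://example.org/repo1", "git"),
    ListedOrigin("lister-a", "https://example.org/repo2", "git"),
    ListedOrigin("lister-a", "https://example.org/repo3", "svn"),
    ListedOrigin("lister-b", "https://example.org/pkg", "pypi"),
]
types_a = visit_types_for_lister(origins, "lister-a")
```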
  20. Oct 09, 2024
    • interface: Add with_first_visits_to_schedule parameter to get_listers · 6b266002
      Antoine Lambert authored
      This new optional parameter restricts the result to listers whose first
      visits of listed origins must be scheduled with high priority after a
      first listing but have not been scheduled yet.
      
      Such listers have the first_visits_queue_prefix attribute set.
      
      Related to #4687.
      6b266002
    • model: Add new columns to Lister model related to priority scheduling · ccee462b
      Antoine Lambert authored
      In order to implement a new scheduler runner that will schedule first
      visits of listed origins with high priority, add the following new
      columns to the Lister model:
      
      - last_listing_finished_at: Timestamp at which the last execution of
        the lister finished
      
      - first_visits_queue_prefix: Optional prefix of message queue names
        to schedule first visits with high priority
      
      - first_visits_scheduled_at: Timestamp at which all the first visits
        of listed origins with high priority were scheduled
      
      Related to #4687.
      ccee462b
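The three columns from commit ccee462b, together with the filter from commit 6b266002, can be sketched with a plain dataclass. The column names come from the commit message; the `name` field, the function name and the exact predicate are assumptions based on a reading of the two messages, not the actual model or query.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List, Optional


@dataclass
class Lister:
    """Reduced, hypothetical view of the Lister model."""

    name: str
    last_listing_finished_at: Optional[datetime] = None
    first_visits_queue_prefix: Optional[str] = None
    first_visits_scheduled_at: Optional[datetime] = None


def listers_with_first_visits_to_schedule(listers: Iterable[Lister]) -> List[Lister]:
    """Assumed predicate: a queue prefix is set, a listing has finished,
    and the first visits have not been scheduled yet."""
    return [
        lister
        for lister in listers
        if lister.first_visits_queue_prefix is not None
        and lister.last_listing_finished_at is not None
        and lister.first_visits_scheduled_at is None
    ]


now = datetime(2024, 10, 9, tzinfo=timezone.utc)
listers = [
    Lister("plain"),  # no high-priority first visits at all
    Lister("listing-not-finished", first_visits_queue_prefix="prio"),
    Lister("to-schedule", last_listing_finished_at=now,
           first_visits_queue_prefix="prio"),
    Lister("already-scheduled", last_listing_finished_at=now,
           first_visits_queue_prefix="prio", first_visits_scheduled_at=now),
]
pending = [lister.name for lister in listers_with_first_visits_to_schedule(listers)]
```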
  21. Sep 10, 2024
  22. Aug 30, 2024