  1. Sep 02, 2021
  2. Aug 27, 2021
  3. Aug 26, 2021
  4. Aug 18, 2021
  5. Aug 06, 2021
  6. Aug 03, 2021
  7. Jul 30, 2021
  8. Jul 23, 2021
      Only record last_visited and last_successful in origin_visit_stats · 87e66faa
      Nicolas Dandrimont authored
      After using this schema for a while, all queries can be implemented in
      terms of these two timestamps, instead of the four original
      last_eventful, last_uneventful, last_failed and last_notfound
      timestamps.
      
      This ends up simplifying the logic within the journal client, as well as
      that of the grab_next_visits query builder.
      
      To make this change work, we also stop considering out of order messages
      altogether in journal_client. This welcome simplification is an accuracy
      tradeoff that is explained in the updated documentation of the journal
      client:
      
      .. [1] Ignoring out of order messages makes the initialization of the
            origin_visit_status table (from a full journal) less deterministic: only the
            `last_visit`, `last_visit_state` and `last_successful` fields are guaranteed
            to be exact, the `next_position_offset` field is a best effort estimate
            (which should converge once the client has run for a while on in-order
            messages).
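A minimal sketch of what ignoring out-of-order messages could look like, assuming each message carries a `date` and a hypothetical `successful` flag (neither the message layout nor the `apply_messages` helper comes from the commit itself). Treating a message with a date equal to `last_visit` as stale is also an assumption:

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Iterable, Optional


def apply_messages(
    messages: Iterable[dict],
    stats: Optional[Dict[str, Optional[datetime]]] = None,
) -> Dict[str, Optional[datetime]]:
    """Fold visit messages into the two timestamps kept per origin.

    Hypothetical helper: `messages` are dicts with a `date` (datetime) and
    a `successful` (bool) key; the real journal client's schema may differ.
    """
    stats = dict(stats or {"last_visit": None, "last_successful": None})
    for msg in messages:
        last_visit = stats["last_visit"]
        # Out-of-order message: not newer than what was already recorded, skip it.
        if last_visit is not None and msg["date"] <= last_visit:
            continue
        stats["last_visit"] = msg["date"]
        if msg["successful"]:
            stats["last_successful"] = msg["date"]
    return stats


# Example timeline: the second message arrives late and is dropped.
t1 = datetime(2021, 7, 1, tzinfo=timezone.utc)
t2 = t1 + timedelta(days=1)
t3 = t1 + timedelta(days=2)
example = apply_messages(
    [
        {"date": t2, "successful": True},
        {"date": t1, "successful": True},   # out of order, ignored
        {"date": t3, "successful": False},
    ]
)
```

In this sketch the late message at `t1` never influences the result, which mirrors the accuracy tradeoff described in the footnote: the timestamps stay exact, while anything derived from per-message history can only be approximate.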
      test_journal_client: Unify test assertion like the rest · 3ca0d659
      Antoine R. Dumont authored
      Related to D5917
      test: Refactor assert_visit_stats_ok to ignore_fields · 8cf2238e
      Antoine R. Dumont authored
      This properly simplifies and unifies the test utility function that compares visit stats.
  9. Jul 22, 2021
  10. Jul 06, 2021
      journal_client: Compute next position for origin visit · 8c4ae9f1
      Antoine R. Dumont authored
      For origins without any last_update information [1], the journal client is now also in
      charge of moving their next position in the queue for rescheduling. Depending on the
      visit status, the next position offset and next_visit_queue_position are updated after
      each visit completes:
      
      - if the visit has failed, increase the next visit target by the minimal visit
        interval (to take into account transient loading issues)
      - if the visit is successful, and records some changes, decrease the visit interval
        index by 2 (visit the origin *way* more often).
      - if the visit is successful, and records no changes, increase the visit interval index
        by 1 (visit the origin less often).
      
      We then set the next visit target to its current value + the new visit interval
      multiplied by a random fudge factor (picked in the -/+ 10% range).
      
      The fudge factor allows the visits to spread out, avoiding "bursts" of loaded origins
      e.g. when a number of origins from a single hoster are processed at once.
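The rules above can be sketched as follows. This is an illustration under stated assumptions, not the journal client's actual code: the `VISIT_INTERVALS` table, the `QueueState` type, the status strings, the clamping of the index, and the choice to apply the fudge factor only to successful visits are all hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Optional

# Hypothetical interval table, in seconds; the commit message gives no values.
VISIT_INTERVALS = [3600 * hours for hours in (1, 2, 4, 8, 16, 32, 64)]
MIN_INTERVAL = VISIT_INTERVALS[0]


@dataclass
class QueueState:
    next_visit_queue_position: float
    next_position_offset: int  # index into VISIT_INTERVALS


def update_position(
    state: QueueState, status: str, fudge: Optional[float] = None
) -> QueueState:
    """Return the new queue state after a visit with the given status.

    `status` is one of "failed", "eventful" (successful, with changes) or
    "uneventful" (successful, no changes); these labels are assumptions.
    """
    if status == "failed":
        # Transient loading issue: retry after the minimal visit interval.
        return QueueState(
            state.next_visit_queue_position + MIN_INTERVAL,
            state.next_position_offset,
        )
    if status == "eventful":
        offset = state.next_position_offset - 2  # visit way more often
    elif status == "uneventful":
        offset = state.next_position_offset + 1  # visit less often
    else:
        raise ValueError(f"unknown status: {status!r}")
    # Assumption: keep the index within the bounds of the table.
    offset = max(0, min(offset, len(VISIT_INTERVALS) - 1))
    if fudge is None:
        # Random factor in the -/+ 10% range, spreading visits out.
        fudge = random.uniform(0.9, 1.1)
    new_position = state.next_visit_queue_position + VISIT_INTERVALS[offset] * fudge
    return QueueState(new_position, offset)
```

Passing `fudge=1.0` makes the result deterministic, which is convenient for checking the arithmetic: starting from position 1000.0 at index 3, an eventful visit moves to index 1 and position 1000.0 + 7200.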
      
      Note that the computations happen for all origins, for simplicity and code maintenance,
      but the results will only be used by a new, upcoming scheduling policy.
      
      [1] The lister cannot provide it for some reason.
  11. Jul 01, 2021
  12. Jun 29, 2021
  13. Jun 23, 2021
  14. Jun 22, 2021