  1. Nov 29, 2024
    • Implement a separate kafka communication thread for journal clients · c674a974
      Nicolas Dandrimont authored
      This communication thread is in charge of pulling the messages from
      kafka and handing them off to a processing thread, as well as doing
      regular polling of the rdkafka client (which in turn notifies the
      brokers that the consumer is still alive).
      
      Doing this allows the kafka communication thread to pause the kafka
      consumption explicitly when processing a batch of messages takes too
      long. This can in turn avoid a lot of rebalance traffic on the kafka
      brokers, and overall avoids a bunch of internal rdkafka timeouts.
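
The split described above can be sketched with two threads and a bounded hand-off queue. This is only an illustration under stated assumptions: `FakeConsumer` is a hypothetical stand-in (it is not the rdkafka or confluent-kafka API), and "processing takes too long" is approximated by the hand-off queue being full.

```python
import queue
import threading
import time


class FakeConsumer:
    """Hypothetical stand-in for a Kafka consumer (not a real client API)."""

    def __init__(self, messages):
        self._messages = list(messages)
        self.paused = False
        self.pause_count = 0

    def poll(self):
        # Polling always happens (it is what keeps the consumer "alive"),
        # but a paused or drained consumer delivers no message.
        if self.paused or not self._messages:
            return None
        return self._messages.pop(0)

    def pause(self):
        self.paused = True
        self.pause_count += 1

    def resume(self):
        self.paused = False


def communication_loop(consumer, handoff, stop):
    """Poll regularly; pause consumption while the processing side is busy."""
    while not stop.is_set():
        if handoff.full():
            if not consumer.paused:
                consumer.pause()  # processing is lagging: stop fetching
        elif consumer.paused:
            consumer.resume()  # processing caught up: fetch again
        msg = consumer.poll()  # keep polling even while paused
        if msg is not None:
            handoff.put(msg)  # cannot block: only this thread puts
        else:
            time.sleep(0.001)


def run_demo(n=5):
    consumer = FakeConsumer(range(n))
    handoff = queue.Queue(maxsize=1)
    stop = threading.Event()
    results = []

    def process():
        while len(results) < n:
            msg = handoff.get()
            time.sleep(0.02)  # simulate slow processing
            results.append(msg)
        stop.set()

    comm = threading.Thread(
        target=communication_loop, args=(consumer, handoff, stop)
    )
    worker = threading.Thread(target=process)
    comm.start()
    worker.start()
    worker.join(timeout=5)
    comm.join(timeout=5)
    return results, consumer.pause_count
```

Calling `run_demo()` returns the processed messages in order, together with the number of times the communication thread paused the consumer while the worker was busy.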
  2. Nov 22, 2024
  3. Nov 21, 2024
  4. Aug 30, 2024
  5. Aug 27, 2024
  6. Jul 22, 2024
  7. Jun 03, 2024
  8. May 17, 2024
    • model: adapt to 6.13 · 50990232
      Pierre-Yves David authored
      It is no longer possible to instantiate such classes, so these tests are
      no longer needed (nor do they work).
  9. Apr 24, 2024
    • Fix on_eof behavior when the consumer group is rebalanced · e625c64c
      Vincent Sellier authored
      When a CG is rebalanced, no messages are delivered to the consumer
      until the group stabilizes, so a timeout occurs in the consume() method.
      The resulting empty list is interpreted as reaching the end of the
      partition. As a result, all the consumers of the group stop at the same
      time if the rebalancing takes more than 10s.
      
      The fix changes the behavior to wait until a list of partitions is
      assigned to the consumer before testing whether the end has been reached.
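
The check described in this commit can be sketched as a small predicate. This is a simplified model, not the actual client code: the function name `reached_eof` and its parameters are hypothetical, and the end of the partition is modelled only as "an empty batch came back".

```python
def reached_eof(batch, assigned_partitions, stop_on_eof=True):
    """Return True only when an empty batch can really mean end of partition."""
    if not stop_on_eof:
        return False
    if not assigned_partitions:
        # No assignment yet (e.g. a rebalance is in progress): an empty
        # batch is just a timeout, so keep waiting instead of stopping.
        return False
    return len(batch) == 0
```

With this ordering, an empty batch during a rebalance (`reached_eof([], [])`) does not stop the consumer, while an empty batch with partitions assigned does.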
  10. Mar 29, 2024
  11. Feb 06, 2024
  12. Dec 05, 2023
  13. Dec 04, 2023
  14. Dec 03, 2023
  15. Nov 28, 2023
  16. Nov 20, 2023
  17. Nov 16, 2023
    • Add support for object deletion to KafkaJournalWriter · 3b9564f3
      Jérémy Bobbio (Lunar) authored
      “Deleting” an event in Kafka is a two-step process. First,
      a new event is added for the key to be deleted with `null` as its
      value. Such events are known as tombstones. When topics are configured
      to use compaction, older events will actually be deleted after specific
      thresholds have been reached.
      
      Tombstones themselves usually also linger for a while in a topic. This
      gives a chance for consumers to learn that a given key has been deleted.
      This is configured by `delete.retention.ms`. For Software Heritage, we
      should still not rely on consumers of the journal actually seeing these
      tombstones to handle object deletions. If they lag too much, the
      tombstone will eventually be removed (together with the actual data)
      from the journal. This shall be handled by #4658 instead.
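
The two-step deletion and the eventual disappearance of tombstones can be illustrated with a toy in-memory log. This is a sketch only: the log is a plain list of `(key, value)` pairs, `delete` and `compact` are hypothetical helpers rather than the KafkaJournalWriter API, and the `drop_tombstones` flag stands in for the tombstone retention period having elapsed.

```python
def delete(log, key):
    """Step 1: append a tombstone (value None) for the key."""
    log.append((key, None))


def compact(log, drop_tombstones=False):
    """Step 2: keep only the latest event per key, preserving log order."""
    latest = {}
    for offset, (key, _value) in enumerate(log):
        latest[key] = offset  # later offsets overwrite earlier ones
    kept = [log[offset] for offset in sorted(latest.values())]
    if drop_tombstones:
        # Models the tombstone retention period having elapsed.
        kept = [(key, value) for key, value in kept if value is not None]
    return kept


log = [("a", b"1"), ("b", b"2"), ("a", b"3")]
delete(log, "a")
after_compaction = compact(log)
after_retention = compact(log, drop_tombstones=True)
```

A consumer that reads `after_compaction` still sees the tombstone for `"a"`. One that only arrives once the log looks like `after_retention` sees no trace of the key at all, which is why consumers cannot rely on tombstones alone.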
      
      Normally, compaction will be triggered when the ratio of dirty data to
      total data reaches the threshold set by the `min.cleanable.dirty.ratio`
      configuration. `min.compaction.lag.ms` can be set to prevent overly
      aggressive cleaning. This provides a minimum period of time for
      applications to see an event prior to its deletion.
      `max.compaction.lag.ms` sets the time limit before triggering a
      compaction, regardless of the amount of dirty data.
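
How these three settings interact can be sketched as a single decision function. This is a deliberately simplified model and not the broker's actual algorithm: the function name, its parameters, and the single `oldest_dirty_age_ms` input are assumptions made for illustration, and the default values are illustrative only.

```python
def compaction_due(
    dirty_bytes,
    total_bytes,
    oldest_dirty_age_ms,
    min_cleanable_dirty_ratio=0.5,
    min_compaction_lag_ms=0,
    max_compaction_lag_ms=None,
):
    """Decide whether a (toy) log should be compacted now."""
    # Events younger than the minimum lag are never cleaned.
    if oldest_dirty_age_ms < min_compaction_lag_ms:
        return False
    # Past the maximum lag, compaction happens regardless of the dirty ratio.
    if (
        max_compaction_lag_ms is not None
        and oldest_dirty_age_ms >= max_compaction_lag_ms
    ):
        return True
    if total_bytes == 0:
        return False
    # Otherwise, compaction depends on the share of dirty data.
    return dirty_bytes / total_bytes >= min_cleanable_dirty_ratio
```

For example, a log that is only 10% dirty is left alone by default, but becomes due once its oldest dirty event is older than `max_compaction_lag_ms`.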
      
      For more information see:
      https://developer.confluent.io/courses/architecture/compaction/
      
      The `delete` method is only implemented for KafkaJournalWriter because
      the semantics are so closely aligned with Kafka’s.
      
      Based on the initial merge request !233 written by olasd.
      
      Closes: #4657
  18. Sep 04, 2023
    • writer/inmemory: Disable shared memory use by default · a521d738
      Antoine Lambert authored
      Make shared memory use optional in the in-memory journal writer backend
      and disable it by default.
      
      Such a backend is typically created in the tests of swh packages, but only
      a single instance is used, so enabling shared memory is not required.
      
      This greatly speeds up tests that use a journal writer with the memory
      backend and also prevents flaky tests.
  19. Jul 24, 2023
  20. Jul 12, 2023
  21. May 15, 2023
  22. May 05, 2023
  23. May 04, 2023
  24. Mar 13, 2023
  25. Mar 03, 2023
  26. Feb 17, 2023
  27. Feb 16, 2023
  28. Feb 02, 2023
  29. Dec 19, 2022
  30. Nov 15, 2022
  31. Oct 25, 2022
  32. Oct 21, 2022
  33. Oct 18, 2022
  34. Jun 16, 2022
  35. Jun 08, 2022