  1. Jan 26, 2024
  2. Jan 22, 2024
  3. Jan 19, 2024
  4. Jan 18, 2024
  5. Jan 16, 2024
  6. Jan 15, 2024
    • Test removal and restore in Kafka · dd419d58
      Jérémy Bobbio (Lunar) authored
      Removing an object from Kafka requires writing a new message with
      the same key as previously used and an empty value. These tombstones
      are later “compacted”, depending on the topic settings `cleanup.policy`,
      `max.compaction.lag.ms` and `delete.retention.ms`.
      
      To test the presence or absence of objects in Kafka, we thus need to
      find which is the most recent: a tombstone or a value. In order to do
      so, we parse all messages into a single dict, associating each SWHID with
      its latest message timestamp and whether it should be considered present
      or absent. This is sadly a bit time- and memory-consuming, but at least
      we get accurate results.
      
      While not strictly necessary, we now use a topic configuration in
      Kafka that will aggressively try to remove “dead” messages. It
      should slightly reduce the time needed to inventory objects as
      previously described.
      
      We use the match syntax introduced in Python 3.10 in `handle_message()`,
      so we bump black compatibility settings to Python 3.11.
      
      Depends on swh/devel/swh-alter!7 (and a new release thereafter)
      dd419d58
    • docker/tests: Add a trailing / in most API requests · 7a44a266
      David Douard authored
      It prevents a 301 redirect round trip on each HTTP query.
      7a44a266
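As an illustration of the commit above, a test helper could normalize request URLs so that the path always ends with a slash before the request is sent. The helper name `with_trailing_slash()` and the example host and paths are assumptions made for this sketch; the actual tests may simply hard-code the trailing slash, and the commit title says only most API requests are affected.

```python
from urllib.parse import urlsplit, urlunsplit


def with_trailing_slash(url: str) -> str:
    """Return url with a '/' appended to its path if it lacks one.

    The query string and fragment, if any, are left untouched.
    """
    parts = urlsplit(url)
    if parts.path.endswith("/"):
        return url
    return urlunsplit(parts._replace(path=parts.path + "/"))


# Hypothetical host and endpoints, for illustration only.
plain = with_trailing_slash("http://localhost:5080/api/1/origin")
with_query = with_trailing_slash("http://localhost:5080/api/1/origins?limit=10")
```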
    • docker: do not use sourcehut first when ingesting mercurial origins · 0a5bf710
      David Douard authored
      It still seems broken, although it replies to HEAD queries...
      0a5bf710
    • docker: increase the health check retries · 8c885bc2
      David Douard authored
      Starting a service can be pretty slow, waiting for dependencies etc.
      Double the retries to 6 (the default being 3) in all services that
      define a healthcheck with the default retries.
      8c885bc2
    • docker: limit indexer journal client to only origin_intrinsic_metadata · 3095f896
      David Douard authored
      Otherwise the mimetype and fossology indexers are executed, making
      the tests very slow to run.
      3095f896
    • docker: remove the indexer celery worker · 8c6f284f
      David Douard authored
      It's not used any more and is about to be removed from swh-indexer.
      8c6f284f
  7. Jan 11, 2024
  8. Jan 10, 2024
  9. Jan 09, 2024
  10. Jan 03, 2024
  11. Dec 31, 2023
  12. Dec 18, 2023
  13. Dec 11, 2023
    • Use pytest base temp dir to write test logs · 1ffd95af
      Jérémy Bobbio (Lunar) authored
      With e293873, compose logs are dumped in case of errors during the test.
      While this is nice, the logs are currently written to the
      `docker/tests/logs` directory, in files with a random identifier.
      
      This means they require manual cleaning and it is hard to know which one
      is the most recent. This problem can be solved by using the
      [tmp_path_factory](https://docs.pytest.org/en/7.1.x/reference/reference.html#id47)
      fixture, which allows us to create a subdirectory in pytest's base
      temporary directory.
      
      The logs are now accessible through the managed
      `-current` symlink, as follows:
      `/tmp/pytest-of-lunar/pytest-current/docker/test_alter.logs`
      1ffd95af
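A fixture along the lines of the commit above might look like the sketch below. The names `log_path_for()` and `compose_log_file` are hypothetical and are not taken from the repository; only `tmp_path_factory` and its `getbasetemp()` method come from pytest itself. The `docker` subdirectory and `.logs` suffix mirror the example path quoted in the commit message.

```python
from pathlib import Path

import pytest


def log_path_for(base_temp: Path, module_name: str) -> Path:
    """Return <base_temp>/docker/<module_name>.logs, creating the directory."""
    log_dir = base_temp / "docker"
    log_dir.mkdir(parents=True, exist_ok=True)
    return log_dir / f"{module_name}.logs"


@pytest.fixture(scope="module")
def compose_log_file(request, tmp_path_factory) -> Path:
    """Hypothetical fixture giving each test module its own log file."""
    # getbasetemp() is the per-run base directory managed by pytest, so
    # old runs are rotated without any manual cleaning.
    module_name = Path(request.module.__file__).stem
    return log_path_for(tmp_path_factory.getbasetemp(), module_name)
```

Because the file name is derived from the test module rather than from a random identifier, the most recent logs for a given module are always at a predictable path.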
  14. Nov 30, 2023
    • Add integration tests for swh-alter · 68fc4dc5
      Jérémy Bobbio (Lunar) authored
      To test the ability of swh-alter to remove objects from multiple
      storages, we need to set up a dedicated environment with – in addition to
      the common PostgreSQL storage – a Cassandra storage fed by a replayer.
      
      swh-alter also needs to get information from swh-graph but the
      compression is currently too fragile for us to rely on the dedicated
      container. Instead, we implement in `alter_companion.py` a trivial mock
      HTTP server that will always return 404, making swh-alter fall back on
      information from storage.
      
      This server is run in a dedicated swh-alter image. This image will also
      be used to run commands from the same `alter_companion.py` that perform
      queries to the PostgreSQL and Cassandra databases to check the absence
      (or the presence) of a set of SWHIDs.
      
      We can then implement a simple integration scenario where we load
      two origins, check the presence in the storages, remove one of them,
      check the absence in the storages, restore the recovery bundle, and
      check the renewed presence in the storages.
      
      This will later be expanded to check for removal from objstorage(s),
      Kafka, and Elasticsearch.
      68fc4dc5
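The mock server mentioned in the commit above can be approximated with the standard library alone. The following is a sketch under assumptions, not the contents of `alter_companion.py`: the class and function names, the loopback address and the request path are all made up for illustration.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class AlwaysNotFound(BaseHTTPRequestHandler):
    """Answer every GET, HEAD and POST request with an empty 404."""

    def _not_found(self) -> None:
        # Drain any request body so the client is not cut off mid-send.
        length = int(self.headers.get("Content-Length") or 0)
        if length:
            self.rfile.read(length)
        self.send_response(404)
        self.send_header("Content-Length", "0")
        self.end_headers()

    do_GET = do_HEAD = do_POST = _not_found

    def log_message(self, format, *args):
        pass  # keep test output quiet


def start_mock_graph(host: str = "127.0.0.1", port: int = 0) -> HTTPServer:
    """Start the mock in a daemon thread; port 0 picks a free port."""
    server = HTTPServer((host, port), AlwaysNotFound)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def status_of(url: str) -> int:
    """Return the HTTP status for url, bypassing any configured proxy."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


server = start_mock_graph()
port = server.server_address[1]
status = status_of(f"http://127.0.0.1:{port}/graph/stats")  # hypothetical path
server.shutdown()
server.server_close()
```

A client that treats 404 as "no information available" would then take its fallback path, which is the behaviour the commit relies on for swh-alter.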
    • Make setup_pip() shell function more resilient · ab75a6c9
      Jérémy Bobbio (Lunar) authored
      It used to crash if there was a `/src` but no directories starting
      with `swh-` in it.
      ab75a6c9
  15. Nov 29, 2023
  16. Nov 22, 2023
    • docker: Add optional services to update save requests through webhooks · 7cd58529
      Antoine Lambert authored
      As a first experiment using webhooks in Software Heritage, add optional
      services that make it possible to update a Save Code Now request status
      by sending push notifications through webhook messages to the webapp.
      
      That feature can be used by initializing the docker environment with the
      following command:
      
      $ docker compose -f docker-compose.yml -f docker-compose.webhooks.yml up -d
      
      In that configuration, the service pulling Save Code Now requests to update
      their status is disabled and a new journal client service is used to forward
      origin visit status events through webhook messages.
      
      Webhook management relies on Svix, an open source framework that
      offers webhook sending as a service, and on the new swh-webhooks package,
      which interacts with the Svix server through its REST API.
      
      Related to swh/meta#4836.
      7cd58529