- Apr 20, 2021
-
-
Jenkins for Software Heritage authored
-
Jenkins for Software Heritage authored
Update to upstream version '0.13.0' with Debian dir 753ba2cc0646d7be73acfff0f55b36cc7fe12e65
-
Antoine R. Dumont authored
Since [1], tasks with priority are routed to dedicated queues (see tasks for more details). The tasks with priority to be scheduled have their own dedicated endpoints to be called. [1] Related to T3084 Related to T3271
-
vlorentz authored
So errors on the CLI side do not trigger an exception on the server
- Apr 15, 2021
-
-
Antoine R. Dumont authored
Related to T3084
-
Jenkins for Software Heritage authored
-
Jenkins for Software Heritage authored
Update to upstream version '0.12.0' with Debian dir 788879e9b7caf6d54d8f4645d720dc7b7859e069
-
Antoine R. Dumont authored
This splits the reading of tasks into two calls: one for tasks with no priority (standard) and another for tasks with priority. If any tasks with priority are detected, they are routed to dedicated queues (one per task type) whose names are prefixed with `save_code_now:`. Related to T3084
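A minimal sketch of the routing described above. It assumes a task is a dict with hypothetical `type` and `priority` keys, and that the dedicated queue name is the `save_code_now:` prefix followed by the task type; the actual runner may name its queues differently.

```python
from typing import Dict, List, Optional

# Prefix taken from the commit message; everything else is illustrative.
PRIORITY_QUEUE_PREFIX = "save_code_now:"


def queue_name(task_type: str, priority: Optional[str]) -> str:
    """Return the destination queue for a task (hypothetical naming scheme)."""
    if priority is None:
        # Standard tasks keep using the queue named after their task type.
        return task_type
    return f"{PRIORITY_QUEUE_PREFIX}{task_type}"


def route(tasks: List[dict]) -> Dict[str, List[dict]]:
    """Group tasks by destination queue, one priority queue per task type."""
    queues: Dict[str, List[dict]] = {}
    for task in tasks:
        name = queue_name(task["type"], task.get("priority"))
        queues.setdefault(name, []).append(task)
    return queues
```

Under these assumptions, a `load-git` task with a non-null priority ends up in `save_code_now:load-git`, while the same task type without priority stays in `load-git`.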
-
vlorentz authored
-
- Apr 14, 2021
-
-
Jenkins for Software Heritage authored
-
Jenkins for Software Heritage authored
Update to upstream version '0.11.0' with Debian dir aac6bb6258a43d962787d39b4006589985c90bcb
- Apr 13, 2021
-
-
Antoine R. Dumont authored
The notion of priority becomes blurred: any task with a non-null priority is considered for reading or grabbing. In a future commit, this should allow the runner to evolve so that it reroutes tasks with priority to other queues. Related to T3084
-
- Feb 11, 2021
-
-
Nicolas Dandrimont authored
psycopg2.extras.execute_values executes queries in batches of 100 by default. At the end of execute_values, only the last batch of results is available in the cursor. To fetch all results, one needs to set fetch=True instead of reading from the cursor.
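The pitfall can be illustrated with a toy model that needs no database. `ToyCursor` and `execute_in_batches` below are hypothetical stand-ins written for this illustration, not psycopg2 code; they only mimic the two behaviours the message describes (batched execution, and a cursor that keeps the rows of the last statement only).

```python
from typing import List, Optional, Sequence


class ToyCursor:
    """Stand-in for a database cursor (not a psycopg2 class).

    It only remembers the rows produced by the most recent statement.
    """

    def __init__(self) -> None:
        self._rows: List[int] = []

    def execute(self, batch: Sequence[int]) -> None:
        # Pretend every inserted value comes back through a RETURNING clause.
        self._rows = list(batch)

    def fetchall(self) -> List[int]:
        rows, self._rows = self._rows, []
        return rows


def execute_in_batches(
    cur: ToyCursor,
    values: Sequence[int],
    page_size: int = 100,
    fetch: bool = False,
) -> Optional[List[int]]:
    """Run one statement per page; collect every page's rows if fetch=True."""
    result: Optional[List[int]] = [] if fetch else None
    for start in range(0, len(values), page_size):
        cur.execute(values[start:start + page_size])
        if fetch:
            result.extend(cur.fetchall())
    return result
```

With 250 values and the default page size, reading from the cursor afterwards yields only the last 50 rows, whereas `fetch=True` returns all 250.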
-
Nicolas Dandrimont authored
This allows us to support reading the journal from the beginning, ignoring messages with the old schema.
-
Nicolas Dandrimont authored
The built-in `max` function can take an iterable directly, no need to reimplement it.
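For illustration, `max` accepts any iterable, including a generator, and takes a `default` for the empty case. The dates below are made up for the example.

```python
from datetime import datetime

# Any iterable works, including a generator expression.
visit_dates = (datetime(2021, 2, day) for day in (3, 9, 5))
latest = max(visit_dates)

# `default` avoids a ValueError when the iterable is empty.
latest_or_none = max(iter(()), default=None)
```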
-
- Feb 09, 2021
-
-
Vincent Sellier authored
Fix a wrong computation when several messages (>=3) for the same snapshot are received in the wrong order.

For example, before the fix, the following occurs:

```
| date | snapshot |     | last_ev  | last_unev | Snap |
| ---- | -------- | --- | -------- | --------- | ---- |
| 2022 | S2       |     | 2022     |           | S2   |
| 2020 | S2       |     | 2020     | 2022      | S2   |
| 2021 | S2       |     | **2021** | **2020**  | S2   |
```

when it should be:

```
| date | snapshot |     | last_ev  | last_unev | Snap |
| ---- | -------- | --- | -------- | --------- | ---- |
| 2022 | S2       |     | 2022     |           | S2   |
| 2020 | S2       |     | 2020     | 2022      | S2   |
| 2021 | S2       |     | **2020** | **2022**  | S2   |
```

Related to T3000
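The expected values in the table are consistent with a simple order-independent rule for visits that all share one snapshot: `last_ev` is the earliest date seen and `last_unev` the latest one, if it differs. The sketch below encodes only that rule, inferred from the table; `visit_bounds` is a hypothetical helper, not the scheduler's actual code, and it ignores visits with other snapshots.

```python
from typing import Iterable, Optional, Tuple


def visit_bounds(dates: Iterable[int]) -> Tuple[Optional[int], Optional[int]]:
    """Return (last_ev, last_unev) for visits that all share one snapshot."""
    earliest: Optional[int] = None
    latest: Optional[int] = None
    for date in dates:
        # Track both extremes so the arrival order of messages does not matter.
        earliest = date if earliest is None else min(earliest, date)
        latest = date if latest is None else max(latest, date)
    # A single visit (or none) has no uneventful follow-up.
    last_unev = latest if latest != earliest else None
    return earliest, last_unev
```

Feeding the dates in the order of the table (2022, 2020, 2021) gives `last_ev` 2020 and `last_unev` 2022, matching the corrected rows.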
-
- Feb 05, 2021
-
-
Antoine R. Dumont authored
As the loader will start to create failed status messages, deal with them if any. Related to T3030
-
- Feb 03, 2021
-
-
Jenkins for Software Heritage authored
-
Jenkins for Software Heritage authored
Update to upstream version '0.10.0' with Debian dir 17cb8a0e3def3b15efe5ce2e2ae36c621314e1f3
-
Nicolas Dandrimont authored
With late acknowledgements, RabbitMQ will re-send tasks to clients even if they can't ever complete the task (e.g. when the task gets killed because the machine is out of memory). This problem only increases over time, leading to complete starvation of the ingestion system. Now that we have multiple mechanisms to issue retries of tasks, we can use early acknowledgements for tasks instead, which should mitigate the ongoing starvation, at the expense of having to retry tasks externally.
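The trade-off can be seen in a toy simulation that involves no real broker. `consume` is a hypothetical helper; it assumes, as a simplification, that an unacknowledged task is put back at the head of the queue when its worker dies.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


def consume(
    tasks: List[str],
    crashes: Callable[[str], bool],
    acks_late: bool,
    max_deliveries: int = 10,
) -> Tuple[List[str], List[str]]:
    """Deliver tasks from a toy queue; return (delivery log, lost tasks)."""
    queue: Deque[str] = deque(tasks)
    deliveries: List[str] = []
    lost: List[str] = []
    while queue and len(deliveries) < max_deliveries:
        task = queue.popleft()
        deliveries.append(task)
        if crashes(task):
            if acks_late:
                # The worker died before acknowledging: the task is requeued.
                queue.appendleft(task)
            else:
                # The task was acknowledged on receipt: the broker forgets it,
                # so it has to be retried by an external mechanism.
                lost.append(task)
    return deliveries, lost
```

With late acknowledgements a task that always kills its worker is redelivered until the delivery budget runs out and starves the tasks behind it; with early acknowledgements it is delivered once and reported as lost, and the remaining tasks proceed.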
- Feb 01, 2021
-
-
David Douard authored
-
David Douard authored
-
- Jan 29, 2021
-
-
David Douard authored
-
- Jan 26, 2021
-
-
We already do that in the scheduler backend function
-
-
This allows us to check the behavior of the archive over time in terms of number of visits.
-
This was a significant bottleneck of the simulator. To work around this, we:
  - Generate snapshot ids consistently in the OriginModel
  - Cache the origin data locally in the simulator, to compute the eventfulness of visits
  - Cache the last visit time for all origins to compute the estimated run time of visit tasks.
-
The earlier implementation would just schedule new visits for origins forever, regardless of whether they were already scheduled or not.
-