replay: overhaul worker thread reporting
- Apr 08, 2025
Nicolas Dandrimont authored
As the parallelism for replayers uses threads, worker stalls can't be acted upon by the main thread. Writing the max stall duration to a status file allows the container scheduler to act if needed.
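As a rough illustration of the status-file idea described above, the following is a minimal sketch, not the code from this commit. The function names, the file format (a single number of seconds), and the `last_report` mapping of worker name to last report time are all assumptions made for the example.

```python
import os
import tempfile
import time
from typing import Dict, Optional


def max_stall_duration(
    last_report: Dict[str, float], now: Optional[float] = None
) -> float:
    """Return the longest time, in seconds, since any worker last reported.

    `last_report` (hypothetical) maps a worker name to the monotonic
    timestamp of its most recent report.
    """
    if now is None:
        now = time.monotonic()
    if not last_report:
        return 0.0
    return max(now - ts for ts in last_report.values())


def write_status_file(path: str, stall_seconds: float) -> None:
    """Write the stall duration atomically.

    The value is written to a temporary file in the same directory and then
    renamed over the target, so an external reader (for instance a probe run
    by the container scheduler) never sees a partially written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".status-")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(f"{stall_seconds:.3f}\n")
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

In such a setup, the main thread would call `write_status_file` periodically, and an external check would compare the value in the file against a threshold of its own choosing to decide whether to restart the container.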
-
Nicolas Dandrimont authored
-
Nicolas Dandrimont authored
- Strong typing for report queue messages
- Add idle and "object in progress" reporting
- Add the ability to request a restart
- Restart worker threads when they raise an issue
- Print warnings when a thread is detected as stalled (including which object is being processed).
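The items above could be sketched along the following lines. This is an illustrative sketch only, not the implementation from this commit: the message classes `Idle`, `InProgress` and `RestartRequested`, the `Monitor` class, and the warning text are all hypothetical names chosen for the example.

```python
import queue
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Idle:
    """Worker is waiting for work (hypothetical message type)."""
    worker: str
    timestamp: float


@dataclass(frozen=True)
class InProgress:
    """Worker started processing an object (hypothetical message type)."""
    worker: str
    timestamp: float
    object_id: str


@dataclass(frozen=True)
class RestartRequested:
    """Worker asks the main thread to restart it (hypothetical message type)."""
    worker: str
    timestamp: float
    reason: str


Report = Union[Idle, InProgress, RestartRequested]


class Monitor:
    """Main-thread view of worker state, built from typed queue messages."""

    def __init__(self, stall_threshold: float) -> None:
        self.stall_threshold = stall_threshold
        self.last: Dict[str, Report] = {}
        self.to_restart: List[str] = []

    def drain(self, reports: "queue.Queue[Report]") -> None:
        """Consume all pending messages without blocking."""
        while True:
            try:
                msg = reports.get_nowait()
            except queue.Empty:
                return
            if isinstance(msg, RestartRequested):
                self.to_restart.append(msg.worker)
            else:
                self.last[msg.worker] = msg

    def stalled(self, now: float) -> List[str]:
        """Return one warning per worker stuck on an object for too long."""
        warnings = []
        for worker, msg in sorted(self.last.items()):
            # An idle worker is waiting for work, so it is not counted as stalled.
            if isinstance(msg, InProgress):
                elapsed = now - msg.timestamp
                if elapsed > self.stall_threshold:
                    warnings.append(
                        f"worker {worker} stalled for {elapsed:.1f}s on {msg.object_id}"
                    )
        return warnings
```

Using distinct message classes rather than bare tuples or strings lets the main thread dispatch with `isinstance` and lets a type checker catch malformed reports. Recording the object identifier in the "in progress" message is what makes it possible to name the object in a stall warning.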
-