Make the copy process of blob objects run with thread concurrency

For each batch of messages, dispatch the copy of individual objects in a ThreadPoolExecutor. The idea is to allow concurrency to go beyond the process parallelism provided by Kafka consumer groups. The copy of one object is mainly IO bound (check existence in the destination objstorage, retrieve from the source objstorage, put in the destination objstorage) with possibly large delays (e.g. retrieving a blob from S3 implies a minimum delay of 150-200 ms before the HTTP request is answered, whatever the size of the object), so this tries to overlap those delays across threads.
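
As an illustration only, the approach described above could be sketched as follows. This is not the code from this merge request: `DictObjStorage`, `copy_object`, `process_batch` and the `max_workers` default are hypothetical names and values chosen for the example, and the in-memory storage merely simulates request latency with a sleep. The sketch also assumes object ids within a batch are unique.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time


class DictObjStorage:
    """Hypothetical in-memory stand-in for an object storage backend."""

    def __init__(self, objects=None, latency=0.0):
        self._objects = dict(objects or {})
        self._latency = latency  # simulated per-request delay, in seconds

    def __contains__(self, obj_id):
        return obj_id in self._objects

    def get(self, obj_id):
        time.sleep(self._latency)  # stands in for the network round trip
        return self._objects[obj_id]

    def add(self, obj_id, content):
        self._objects[obj_id] = content


def copy_object(obj_id, src, dst):
    """Copy one object; return True if copied, False if already present."""
    if obj_id in dst:
        return False
    dst.add(obj_id, src.get(obj_id))
    return True


def process_batch(obj_ids, src, dst, max_workers=8):
    """Copy a batch of objects concurrently; return the number copied."""
    copied = 0
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [
            executor.submit(copy_object, obj_id, src, dst) for obj_id in obj_ids
        ]
        # Collect results as they finish; result() re-raises any worker error.
        for future in as_completed(futures):
            copied += int(future.result())
    return copied
```

Because each worker spends most of its time waiting on IO, threads are enough to overlap the per-object delays; the batch still completes as a unit before the next one is processed.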


Migrated from D6945 (view on Phabricator)
