[swh-storage] Lots of Kafka timeouts
There are a lot of Kafka timeouts in the storage logs on saam.
They generate a lot of errors on the client side (HTTP 503 responses on write requests).
Apr 17 16:46:53 saam python3[3626799]: 2023-04-17 16:46:53 [3626799] swh.journal.writer.kafka:ERROR FAIL [swh.storage.journal_writer.saam#producer-1] [thrd:kafka2.internal.softwareheritage.org:9092/bootstrap]: kafka2.internal.softwareheritage.org:9092/2: 14 request(s) timed out: disconnect (after 865774ms in state UP)
Apr 17 16:46:53 saam python3[3626799]: 2023-04-17 16:46:53 [3626799] swh.journal.writer.kafka:INFO Received non-fatal kafka error: KafkaError{code=_TIMED_OUT,val=-185,str="kafka2.internal.softwareheritage.org:9092/2: 14 request(s) timed out: disconnect (after 865774ms in state UP)"}
Apr 17 16:46:56 saam python3[2772757]: 2023-04-17 16:46:56 [2772757] swh.journal.writer.kafka:INFO REQTMOUT [swh.storage.journal_writer.saam#producer-1] [thrd:kafka2.internal.softwareheritage.org:9092/bootstrap]: kafka2.internal.softwareheritage.org:9092/2: Timed out ProduceRequest in flight (after 60015ms, timeout #0)
Apr 17 16:46:56 saam python3[2772757]: 2023-04-17 16:46:56 [2772757] swh.journal.writer.kafka:INFO REQTMOUT [swh.storage.journal_writer.saam#producer-1] [thrd:kafka2.internal.softwareheritage.org:9092/bootstrap]: kafka2.internal.softwareheritage.org:9092/2: Timed out ProduceRequest in flight (after 60015ms, timeout #1)
Apr 17 16:46:56 saam python3[2772757]: 2023-04-17 16:46:56 [2772757] swh.journal.writer.kafka:INFO REQTMOUT [swh.storage.journal_writer.saam#producer-1] [thrd:kafka2.internal.softwareheritage.org:9092/bootstrap]: kafka2.internal.softwareheritage.org:9092/2: Timed out ProduceRequest in flight (after 60015ms, timeout #2)
Apr 17 16:46:56 saam python3[2772757]: 2023-04-17 16:46:56 [2772757] swh.journal.writer.kafka:INFO REQTMOUT [swh.storage.journal_writer.saam#producer-1] [thrd:kafka2.internal.softwareheritage.org:9092/bootstrap]: kafka2.internal.softwareheritage.org:9092/2: Timed out ProduceRequest in flight (after 60015ms, timeout #3)
Apr 17 16:46:56 saam python3[2772757]: 2023-04-17 16:46:56 [2772757] swh.journal.writer.kafka:INFO REQTMOUT [swh.storage.journal_writer.saam#producer-1] [thrd:kafka2.internal.softwareheritage.org:9092/bootstrap]: kafka2.internal.softwareheritage.org:9092/2: Timed out ProduceRequest in flight (after 60015ms, timeout #4)
Apr 17 16:46:56 saam python3[2772757]: 2023-04-17 16:46:56 [2772757] swh.journal.writer.kafka:WARNING REQTMOUT [swh.storage.journal_writer.saam#producer-1] [thrd:kafka2.internal.softwareheritage.org:9092/bootstrap]: kafka2.internal.softwareheritage.org:9092/2: Timed out 61 in-flight, 0 retry-queued, 0 out-queue, 0 partially-sent requests
Apr 17 16:46:56 saam python3[2772757]: 2023-04-17 16:46:56 [2772757] swh.journal.writer.kafka:ERROR FAIL [swh.storage.journal_writer.saam#producer-1] [thrd:kafka2.internal.softwareheritage.org:9092/bootstrap]: kafka2.internal.softwareheritage.org:9092/2: 61 request(s) timed out: disconnect (after 1527431ms in state UP)
Apr 17 16:46:56 saam python3[2772757]: 2023-04-17 16:46:56 [2772757] swh.journal.writer.kafka:INFO Received non-fatal kafka error: KafkaError{code=_TIMED_OUT,val=-185,str="kafka2.internal.softwareheritage.org:9092/2: 61 request(s) timed out: disconnect (after 1527431ms in state UP)"}
Apr 17 16:46:57 saam python3[2772757]: 2023-04-17 16:46:57 [2772757] swh.journal.writer.kafka:INFO REQTMOUT [swh.storage.journal_writer.saam#producer-1] [thrd:kafka1.internal.softwareheritage.org:9092/bootstrap]: kafka1.internal.softwareheritage.org:9092/1: Timed out ProduceRequest in flight (after 60964ms, timeout #0)
Apr 17 16:46:57 saam python3[2772757]: 2023-04-17 16:46:57 [2772757] swh.journal.writer.kafka:INFO REQTMOUT [swh.storage.journal_writer.saam#producer-1] [thrd:kafka1.internal.softwareheritage.org:9092/bootstrap]: kafka1.internal.softwareheritage.org:9092/1: Timed out ProduceRequest in flight (after 60964ms, timeout #1)
Apr 17 16:46:57 saam python3[2772757]: 2023-04-17 16:46:57 [2772757] swh.journal.writer.kafka:INFO REQTMOUT [swh.storage.journal_writer.saam#producer-1] [thrd:kafka1.internal.softwareheritage.org:9092/bootstrap]: kafka1.internal.softwareheritage.org:9092/1: Timed out ProduceRequest in flight (after 60964ms, timeout #2)
Apr 17 16:46:57 saam python3[2772757]: 2023-04-17 16:46:57 [2772757] swh.journal.writer.kafka:INFO REQTMOUT [swh.storage.journal_writer.saam#producer-1] [thrd:kafka1.internal.softwareheritage.org:9092/bootstrap]: kafka1.internal.softwareheritage.org:9092/1: Timed out ProduceRequest in flight (after 60964ms, timeout #3)
Apr 17 16:46:57 saam python3[2772757]: 2023-04-17 16:46:57 [2772757] swh.journal.writer.kafka:INFO REQTMOUT [swh.storage.journal_writer.saam#producer-1] [thrd:kafka1.internal.softwareheritage.org:9092/bootstrap]: kafka1.internal.softwareheritage.org:9092/1: Timed out ProduceRequest in flight (after 60964ms, timeout #4)
Apr 17 16:46:57 saam python3[2772757]: 2023-04-17 16:46:57 [2772757] swh.journal.writer.kafka:WARNING REQTMOUT [swh.storage.journal_writer.saam#producer-1] [thrd:kafka1.internal.softwareheritage.org:9092/bootstrap]: kafka1.internal.softwareheritage.org:9092/1: Timed out 70 in-flight, 0 retry-queued, 0 out-queue, 0 partially-sent requests
Apr 17 16:46:57 saam python3[2772757]: 2023-04-17 16:46:57 [2772757] swh.journal.writer.kafka:ERROR FAIL [swh.storage.journal_writer.saam#producer-1] [thrd:kafka1.internal.softwareheritage.org:9092/bootstrap]: kafka1.internal.softwareheritage.org:9092/1: 70 request(s) timed out: disconnect (after 2140840ms in state UP)
Apr 17 16:46:57 saam python3[2772757]: 2023-04-17 16:46:57 [2772757] swh.journal.writer.kafka:INFO Received non-fatal kafka error: KafkaError{code=_TIMED_OUT,val=-185,str="kafka1.internal.softwareheritage.org:9092/1: 70 request(s) timed out: disconnect (after 2140840ms in state UP)"}
Access log excerpt showing the 503 responses:
192.168.100.39 - - [17/Apr/2023:16:28:52 +0000] "POST /metadata_authority/add HTTP/1.1" 503 2243 "-" "python-requests/2.25.1"
192.168.100.24 - - [17/Apr/2023:16:28:52 +0000] "POST /metadata_authority/add HTTP/1.1" 503 2243 "-" "python-requests/2.25.1"
192.168.100.38 - - [17/Apr/2023:16:28:54 +0000] "POST /origin/visit_status/add HTTP/1.1" 503 2594 "-" "python-requests/2.25.1"
192.168.100.35 - - [17/Apr/2023:16:28:57 +0000] "POST /origin/visit/add HTTP/1.1" 503 2594 "-" "python-requests/2.25.1"
192.168.100.27 - - [17/Apr/2023:16:28:57 +0000] "POST /extid/add HTTP/1.1" 503 2129 "-" "python-requests/2.25.1"
192.168.100.28 - - [17/Apr/2023:16:28:57 +0000] "POST /directory/add HTTP/1.1" 503 2178 "-" "python-requests/2.25.1"
192.168.100.42 - - [17/Apr/2023:16:28:57 +0000] "POST /raw_extrinsic_metadata/add HTTP/1.1" 503 2256 "-" "python-requests/2.25.1"
192.168.100.37 - - [17/Apr/2023:16:28:57 +0000] "POST /directory/add HTTP/1.1" 503 2710 "-" "python-requests/2.25.1"
192.168.100.28 - - [17/Apr/2023:16:28:57 +0000] "POST /directory/add HTTP/1.1" 503 2178 "-" "python-requests/2.25.1"
192.168.100.26 - - [17/Apr/2023:16:28:57 +0000] "POST /origin/visit_status/add HTTP/1.1" 503 2619 "-" "python-requests/2.25.1"
192.168.100.37 - - [17/Apr/2023:16:28:57 +0000] "POST /origin/add_multi HTTP/1.1" 503 2260 "-" "python-requests/2.25.1"
192.168.100.35 - - [17/Apr/2023:16:28:57 +0000] "POST /origin/visit/add HTTP/1.1" 503 2613 "-" "python-requests/2.25.1"
192.168.100.24 - - [17/Apr/2023:16:28:57 +0000] "POST /origin/visit/add HTTP/1.1" 503 2458 "-" "python-requests/2.25.1"
192.168.100.26 - - [17/Apr/2023:16:28:57 +0000] "POST /extid/add HTTP/1.1" 503 2129 "-" "python-requests/2.25.1"
192.168.100.39 - - [17/Apr/2023:16:28:57 +0000] "POST /metadata_authority/add HTTP/1.1" 503 2243 "-" "python-requests/2.25.1"
192.168.100.132 - - [17/Apr/2023:16:28:57 +0000] "POST /origin/add_multi HTTP/1.1" 503 2224 "-" "python-requests/2.28.2"
192.168.100.26 - - [17/Apr/2023:16:28:57 +0000] "POST /metadata_authority/add HTTP/1.1" 503 2243 "-" "python-requests/2.25.1"
192.168.100.132 - - [17/Apr/2023:16:28:57 +0000] "POST /metadata_authority/add HTTP/1.1" 503 2243 "-" "python-requests/2.28.2"
192.168.100.132 - - [17/Apr/2023:16:28:57 +0000] "POST /metadata_authority/add HTTP/1.1" 503 2243 "-" "python-requests/2.28.2"
192.168.100.37 - - [17/Apr/2023:16:28:57 +0000] "POST /metadata_authority/add HTTP/1.1" 503 2243 "-" "python-requests/2.25.1"
192.168.100.41 - - [17/Apr/2023:16:28:57 +0000] "POST /origin/visit_status/add HTTP/1.1" 503 2785 "-" "python-requests/2.25.1"
192.168.100.27 - - [17/Apr/2023:16:28:57 +0000] "POST /metadata_authority/add HTTP/1.1" 503 2243 "-" "python-requests/2.25.1"
192.168.100.38 - - [17/Apr/2023:16:28:57 +0000] "POST /origin/visit_status/add HTTP/1.1" 503 2664 "-" "python-requests/2.25.1"
192.168.100.40 - - [17/Apr/2023:16:28:57 +0000] "POST /metadata_authority/add HTTP/1.1" 503 2243 "-" "python-requests/2.25.1"
192.168.100.23 - - [17/Apr/2023:16:28:57 +0000] "POST /origin/visit_status/add HTTP/1.1" 503 2733 "-" "python-requests/2.25.1"
192.168.100.38 - - [17/Apr/2023:16:28:57 +0000] "POST /content/add HTTP/1.1" 503 2011 "-" "python-requests/2.25.1"
192.168.100.26 - - [17/Apr/2023:16:28:58 +0000] "POST /directory/add HTTP/1.1" 503 2178 "-" "python-requests/2.25.1"
192.168.100.35 - - [17/Apr/2023:16:28:58 +0000] "POST /origin/visit/add HTTP/1.1" 503 2446 "-" "python-requests/2.25.1"
192.168.100.36 - - [17/Apr/2023:16:28:58 +0000] "POST /origin/visit/add HTTP/1.1" 503 2537 "-" "python-requests/2.25.1"
192.168.100.36 - - [17/Apr/2023:16:28:58 +0000] "POST /metadata_authority/add HTTP/1.1" 503 2243 "-" "python-requests/2.25.1"
192.168.100.28 - - [17/Apr/2023:16:28:58 +0000] "POST /content/add HTTP/1.1" 503 2011 "-" "python-requests/2.25.1"
192.168.100.21 - - [17/Apr/2023:16:28:59 +0000] "POST /directory/add HTTP/1.1" 503 2178 "-" "python-requests/2.25.1"
192.168.100.23 - - [17/Apr/2023:16:28:59 +0000] "POST /content/add HTTP/1.1" 503 2011 "-" "python-requests/2.25.1"
192.168.100.36 - - [17/Apr/2023:16:28:59 +0000] "POST /content/add HTTP/1.1" 503 2011 "-" "python-requests/2.25.1"
192.168.100.26 - - [17/Apr/2023:16:29:02 +0000] "POST /content/add HTTP/1.1" 503 6936 "-" "python-requests/2.25.1"
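To get a rough idea of which endpoints are most affected, the 503 responses in the access log can be tallied per request path. The following is only a minimal sketch: the function name `count_503_by_endpoint` and the regular expression are illustrative assumptions, not part of swh-storage, and the pattern assumes access log lines shaped like the ones above.

```python
import re
from collections import Counter

# Assumed line shape: ... "METHOD /path HTTP/x.y" STATUS SIZE ...
# (matches the access log excerpt above; adjust for other formats)
ACCESS_RE = re.compile(
    r'"(?:GET|POST) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3}) '
)


def count_503_by_endpoint(lines):
    """Return a Counter mapping request path to its number of 503 responses."""
    counts = Counter()
    for line in lines:
        match = ACCESS_RE.search(line)
        if match and match.group("status") == "503":
            counts[match.group("path")] += 1
    return counts


# Three lines taken from the excerpt above.
sample = [
    '192.168.100.39 - - [17/Apr/2023:16:28:52 +0000] "POST /metadata_authority/add HTTP/1.1" 503 2243 "-" "python-requests/2.25.1"',
    '192.168.100.24 - - [17/Apr/2023:16:28:52 +0000] "POST /metadata_authority/add HTTP/1.1" 503 2243 "-" "python-requests/2.25.1"',
    '192.168.100.38 - - [17/Apr/2023:16:28:54 +0000] "POST /origin/visit_status/add HTTP/1.1" 503 2594 "-" "python-requests/2.25.1"',
]

for path, count in count_503_by_endpoint(sample).most_common():
    print(count, path)
```

Feeding the full access log to the same function (for example, the lines of an open log file) would show whether the failures are spread over all write endpoints or concentrated on a few.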