Correction: Kafka is installed on the node, but the configuration appears to be incomplete:
[2020-11-17 16:29:43,971] INFO Registered kafka:type=kafka.Log4jController MBean (kafka.utils.Log4jControllerRegistration$)
[2020-11-17 16:29:44,426] INFO Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation (org.apache.zookeeper.common.X509Util)
[2020-11-17 16:29:44,446] ERROR Exiting Kafka due to fatal exception (kafka.Kafka$)
java.lang.IllegalArgumentException: Error creating broker listeners from 'PLAINTEXT://journal0.internal.staging.swh.network:': Unable to parse PLAINTEXT://journal0.internal.staging.swh.network: to a broker endpoint
	at kafka.utils.CoreUtils$.listenerListToEndPoints(CoreUtils.scala:268)
	at kafka.server.KafkaConfig.$anonfun$listeners$1(KafkaConfig.scala:1633)
	at kafka.server.KafkaConfig.listeners(KafkaConfig.scala:1632)
	at kafka.server.KafkaConfig.advertisedListeners(KafkaConfig.scala:1660)
	at kafka.server.KafkaConfig.validateValues(KafkaConfig.scala:1731)
	at kafka.server.KafkaConfig.<init>(KafkaConfig.scala:1709)
	at kafka.server.KafkaConfig.<init>(KafkaConfig.scala:1273)
	at kafka.server.KafkaServerStartable$.fromProps(KafkaServerStartable.scala:34)
	at kafka.Kafka$.main(Kafka.scala:68)
	at kafka.Kafka.main(Kafka.scala)
Caused by: org.apache.kafka.common.KafkaException: Unable to parse PLAINTEXT://journal0.internal.staging.swh.network: to a broker endpoint
	at kafka.cluster.EndPoint$.createEndPoint(EndPoint.scala:57)
	at kafka.utils.CoreUtils$.$anonfun$listenerListToEndPoints$6(CoreUtils.scala:265)
	at scala.collection.StrictOptimizedIterableOps.map(StrictOptimizedIterableOps.scala:99)
	at scala.collection.StrictOptimizedIterableOps.map$(StrictOptimizedIterableOps.scala:86)
	at scala.collection.mutable.ArraySeq.map(ArraySeq.scala:38)
	at kafka.utils.CoreUtils$.listenerListToEndPoints(CoreUtils.scala:265)
	... 9 more
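The exception shows the listener URI ending in a bare colon (`PLAINTEXT://journal0.internal.staging.swh.network:`), i.e. the port is missing from the listener declaration. A complete entry in `server.properties` would look like the following sketch (port 9092 is an assumption, matching the bootstrap server used in the topic-creation commands below):

```properties
# server.properties excerpt -- host/port values are assumptions;
# the point is that each listener URI must end with an explicit port.
listeners=PLAINTEXT://journal0.internal.staging.swh.network:9092
advertised.listeners=PLAINTEXT://journal0.internal.staging.swh.network:9092
```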
The topics were created with 64 partitions and a replication factor of 1:
for object_type in content skipped_content directory revision release snapshot origin origin_visit origin_visit_status raw_extrinsic_metadata metadata_fetcher metadata_authority; do
    ./kafka-topics.sh --bootstrap-server journal0.internal.staging.swh.network:9092 --create --config cleanup.policy=compact --partitions 64 --replication-factor 1 --topic swh.journal.objects.$object_type
done

for object_type in revision release; do
    ./kafka-topics.sh --bootstrap-server journal0.internal.staging.swh.network:9092 --create --config cleanup.policy=compact --partitions 64 --replication-factor 1 --topic swh.journal.objects_privileged.$object_type
done
Logs:
root@journal0:/opt/kafka/bin# for object_type in content skipped_content directory revision release snapshot origin origin_visit origin_visit_status raw_extrinsic_metadata metadata_fetcher metadata_authority; do
>   ./kafka-topics.sh --bootstrap-server journal0.internal.staging.swh.network:9092 --create --config cleanup.policy=compact --partitions 64 --replication-factor 1 --topic swh.journal.objects.$object_type
> done
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
Created topic swh.journal.objects.content.
[the same WARNING is printed before each "Created topic" line; repeats elided]
Created topic swh.journal.objects.skipped_content.
Created topic swh.journal.objects.directory.
Created topic swh.journal.objects.revision.
Created topic swh.journal.objects.release.
Created topic swh.journal.objects.snapshot.
Created topic swh.journal.objects.origin.
Created topic swh.journal.objects.origin_visit.
Created topic swh.journal.objects.origin_visit_status.
Created topic swh.journal.objects.raw_extrinsic_metadata.
Created topic swh.journal.objects.metadata_fetcher.
Created topic swh.journal.objects.metadata_authority.

root@journal0:/opt/kafka/bin# for object_type in revision release; do
>   ./kafka-topics.sh --bootstrap-server journal0.internal.staging.swh.network:9092 --create --config cleanup.policy=compact --partitions 64 --replication-factor 1 --topic swh.journal.objects_privileged.$object_type
> done
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
Created topic swh.journal.objects_privileged.revision.
Created topic swh.journal.objects_privileged.release.
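As a quick sanity check, the topic names the two loops generate can be listed with a dry run that only prints the names and never contacts the broker:

```shell
# Dry run: print the topic names the creation loops above generate,
# without touching Kafka. 12 regular topics + 2 privileged ones.
for object_type in content skipped_content directory revision release snapshot \
    origin origin_visit origin_visit_status raw_extrinsic_metadata \
    metadata_fetcher metadata_authority; do
  echo "swh.journal.objects.$object_type"
done
for object_type in revision release; do
  echo "swh.journal.objects_privileged.$object_type"
done
```

Comparing this list against `kafka-topics.sh --list` output on the broker confirms nothing was skipped.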
I have some doubts about how to import the following object types, and whether they need to be imported at all:
swh.journal.objects.metadata_authority
swh.journal.objects.metadata_fetcher
swh.journal.objects.raw_extrinsic_metadata
swh.journal.objects_privileged.release
swh.journal.objects_privileged.revision
origin
swhstorage@storage1:~$ time SWH_CONFIG_FILENAME=storage.yml swh storage backfill origin --start-object=0 --end-object=34500
INFO:swh.storage.backfill:Processing origin range None to 1000
INFO:swh.storage.backfill:Processing origin range 1000 to 2000
INFO:swh.storage.backfill:Processing origin range 2000 to 3000
INFO:swh.storage.backfill:Processing origin range 3000 to 4000
...
INFO:swh.storage.backfill:Processing origin range 33000 to 34000
INFO:swh.storage.backfill:Processing origin range 34000 to 34500

real	0m11.536s
user	0m3.260s
sys	0m0.982s
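The log above suggests the backfiller splits the integer id space into fixed-size batches, with a shorter final batch up to `--end-object`. A minimal sketch of that stepping, assuming a 1000-row batch size (the batch size is an assumption read off the log, not taken from the backfill code):

```shell
# Sketch: step through [0, 34500) in 1000-row batches, mimicking the
# range boundaries seen in the origin backfill log above.
end=34500
step=1000
lo=0
while [ "$lo" -lt "$end" ]; do
  hi=$((lo + step))
  if [ "$hi" -gt "$end" ]; then
    hi=$end  # final, shorter batch: 34000 to 34500
  fi
  echo "range $lo to $hi"
  lo=$hi
done
```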
origin_visit
swhstorage@storage1:~$ time SWH_CONFIG_FILENAME=storage.yml swh storage backfill origin_visit --end-object=35000
INFO:swh.storage.backfill:Processing origin_visit range 34000 to 35000

real	0m16.783s
user	0m8.253s
sys	0m1.191s
origin_visit_status
# after patching line 507 of backfill.py
time SWH_CONFIG_FILENAME=storage.yml swh storage backfill origin_visit_status --end-object=35000
...
INFO:swh.storage.backfill:Processing origin_visit_status range 32000 to 33000
INFO:swh.storage.backfill:Processing origin_visit_status range 33000 to 34000
INFO:swh.storage.backfill:Processing origin_visit_status range 34000 to 35000

real	0m17.936s
user	0m12.551s
sys	0m1.120s
skipped_content
swhstorage@storage1:~$ time SWH_CONFIG_FILENAME=storage.yml swh storage backfill skipped_content
INFO:swh.storage.backfill:Processing skipped_content range None to None

real	0m0.590s
user	0m0.487s
sys	0m0.064s
release
swhstorage@storage1:~$ time SWH_CONFIG_FILENAME=storage.yml swh storage backfill release --start-object=0 --end-object ffff
...
INFO:swh.storage.backfill:Processing release range fffe to ffff
INFO:swh.storage.backfill:Processing release range ffff to None

real	1m16.421s
user	0m19.655s
sys	0m3.622s
snapshot
swhstorage@storage1:~$ time SWH_CONFIG_FILENAME=storage.yml swh storage backfill snapshot --end-object ffff
...
INFO:swh.storage.backfill:Processing snapshot range fffe to ffff
INFO:swh.storage.backfill:Processing snapshot range ffff to None

real	2m30.118s
user	0m31.171s
sys	0m9.037s
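For the hash-keyed tables the ranges are hexadecimal id prefixes rather than integers, and the last batch is open-ended (`ffff to None`). A purely illustrative sketch of how the final batches before the open-ended one line up with the log (this reproduces the printed range strings only, not the backfiller's internals; bash arithmetic assumed):

```shell
# Sketch: print the last hex-prefixed ranges for --end-object ffff,
# mirroring the "fffe to ffff" then "ffff to None" progression above.
end=$((16#ffff))
for lo in $((end - 1)) "$end"; do
  if [ "$lo" -eq "$end" ]; then
    # the batch starting at the end object runs to the end of the id space
    printf 'Processing range %04x to None\n' "$lo"
  else
    printf 'Processing range %04x to %04x\n' "$lo" $((lo + 1))
  fi
done
```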
revision
swhstorage@storage1:~$ time SWH_CONFIG_FILENAME=storage.yml swh storage backfill revision --end-object ffffff
...
INFO:swh.storage.backfill:Processing revision range fffffd to fffffe
INFO:swh.storage.backfill:Processing revision range fffffe to ffffff
INFO:swh.storage.backfill:Processing revision range ffffff to None

real	435m7.953s
user	104m50.847s
sys	19m12.365s
content
swhstorage@storage1:~$ time SWH_CONFIG_FILENAME=storage.yml swh storage backfill content --end-object ffffff
...
INFO:swh.storage.backfill:Processing content range fffffb to fffffc
INFO:swh.storage.backfill:Processing content range fffffc to fffffd
INFO:swh.storage.backfill:Processing content range fffffd to fffffe
INFO:swh.storage.backfill:Processing content range fffffe to ffffff
INFO:swh.storage.backfill:Processing content range ffffff to None

real	845m39.746s
user	213m26.057s
sys	53m32.150s
directory
swhstorage@storage1:~$ time SWH_CONFIG_FILENAME=storage.yml swh storage backfill directory --end-object ffffff
...
INFO:swh.storage.backfill:Processing directory range fffffd to fffffe
INFO:swh.storage.backfill:Processing directory range fffffe to ffffff
INFO:swh.storage.backfill:Processing directory range ffffff to None

real	1326m38.221s
user	560m44.543s
sys	58m21.216s
The backfilling is complete (except for the metadata topics). We will now focus on some clients to make sure all the local configuration is correct (T2814, for example), and then on exposing Kafka to the outside.