As usual, I first tried to do that with the whole schema enabled, including indexes, which makes the initial sync take forever.
This morning cleaned up the replica database, removed all indexes and launched the initial sync again (which is already farther along than it was after 15 hours of work...)
Looks like the replica is now running properly and has remained up to date with the primary for a week (including the weekend's backup window).
And indeed the bootstrap steps for the replica don't seem documented, so let's summarize:
install the latest postgresql version with reasonable settings (effective_cache_size, shared_buffers etc.). Some relevant settings for logical replication since pg 16 are
max_logical_replication_workers = 16 # taken from max_worker_processesmax_sync_workers_per_subscription = 6 # taken from max_logical_replication_workersmax_parallel_apply_workers_per_subscription = 8 # taken from max_logical_replication_workers
create an empty database with the read_replica flavor using swh tooling. I got a bit lost between swh db create, swh db init-admin and swh db init, but eventually I ended up with an appropriate database with an empty schema.
I think I ended up using something along the lines of:
The primary server will accumulate WALs for the whole duration of every single table sync, so you need to be ready to accommodate that (this is especially problematic if a backup is running, as that increases WAL traffic significantly).
monitor the initial sync of tables; The progress indicator here is table size, so it's a bit of a wild guess. Overall without indexes the initial sync takes a few days, the content and directory tables take the longest
select pg_stat_subscription.*, relname from pg_stat_subscription left join pg_class on pg_class.oid = relid;
notice that indexes are making the initial replication take forever, and re-do steps 2-4 dropping the index creation script (60-indexes.sql) in swh db init
When every table finishes syncing, create the associated indexes by running the 60-indexes.sql commands manually. Doing all the create index commands for the same table in parallel allows postgresql to read the data only once (but, counterintuitively, that only works for create index without concurrently.
You can monitor index creation progress with :
select pid, pg_stat_progress_create_index.datname, command, phase, case blocks_total when 0 then 0 else (blocks_done * 100.0) / (blocks_total * 1.0) end as blocks_pc, case tuples_total when 0 then 0 else (tuples_done * 100.0) / (tuples_total * 1.0) end as tuples_pc,relname, query from pg_stat_progress_create_index left join pg_class on pg_class.oid = relid left join pg_stat_activity using (pid);