Clean up raw_extrinsic_metadata table

The table currently holds information we have migrated away from: loaders now write metadata on revision objects instead of snapshots (swh/devel/swh-loader-core!164 (closed), swh/devel/swh-loader-core!336 (closed)).

The table is also large for such a new one (~2 months old), which will not be sustainable.
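To put numbers on the size concern, the on-disk footprint and the share of rows attached to snapshots can be checked before deleting anything. This is only a sketch: it assumes a PostgreSQL session on the storage database and uses the `id` and `format` columns that appear in the query further down. The `total_size` and `nb_rows` aliases are illustrative.

```sql
-- Total on-disk size of the table, including indexes and TOAST data.
select pg_size_pretty(pg_total_relation_size('raw_extrinsic_metadata')) as total_size;

-- Number of snapshot-targeted rows per format, to see what a cleanup would remove.
select format, count(*) as nb_rows
from raw_extrinsic_metadata
where id like 'swh:1:snp:%'
group by format
order by nb_rows desc;
```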

So here is the new plan (initial plan [1]):

  • Drop raw_extrinsic_metadata from the replication subscription

  • Then apply $843

[1] Initial plan:

> delete from raw_extrinsic_metadata 
where id like 'swh:1:snp:%' 
  and  (format = 'replicate-npm-package-json'
           or format = 'pypi-project-json'
  );
> explain ...
 Delete on raw_extrinsic_metadata  (cost=0.00..2366.00 rows=1 width=6)
   ->  Seq Scan on raw_extrinsic_metadata  (cost=0.00..2366.00 rows=1 width=6)
         Filter: ((id ~~ 'swh:1:snp:%'::text) AND ((format = 'replicate-npm-package-json'::text) OR (format = 'pypi-project-json'::text)))
(3 rows)
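The first step of the new plan, dropping the table from replication, might look like the following if the replica is fed by PostgreSQL native logical replication. This is an assumption, as are the publication and subscription names `softwareheritage` and `softwareheritage_replica`, which are hypothetical placeholders. Note that `DROP TABLE` on a publication only works when the publication lists its tables explicitly, not when it was created `FOR ALL TABLES`.

```sql
-- On the publisher: stop publishing changes for this table.
-- 'softwareheritage' is a hypothetical publication name.
alter publication softwareheritage drop table raw_extrinsic_metadata;

-- On the subscriber: re-read the publication's table list so the table
-- is no longer replicated. 'softwareheritage_replica' is a hypothetical
-- subscription name.
alter subscription softwareheritage_replica refresh publication;
```

Once the table is out of the subscription, later changes to it on the primary are no longer streamed to the replica.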

Migrated from T2749 (view on Phabricator)

Edited by Phabricator Migration user