Skip to content

Add fulltext search on origin intrinsic metadata.

I made several choices while writing that Diff that are open to discussion:

  • Using GIN instead of GiST. That seems the most appropriate choice when reading https://www.postgresql.org/docs/9.1/textsearch-indexes.html ; but these part got removed from the doc of pgsql 10: https://www.postgresql.org/docs/10/textsearch-indexes.html
  • Using pg_catalog.simple as dictionary. Since we're dealing with any language and proper names, it seemed best to use a dictionary with no stop word. Though, arguably, most of the data will be English, and stop words usually don't appear in names.
  • It only supports conjunctions of search terms. I could easily add support for arbitrary levels of nestings and disjunctions/negations. That can be done later if we deem it worth it.
  • It indexes JSON keys too. It is probably possible to fix this, at the expanse of complicated SQL code, or some postprocessing in Python.

Resolves T1334 and T1335.

Test Plan

tox


Migrated from D658 (view on Phabricator)

Merge request reports