- Jan 11, 2022
-
- Jan 10, 2022
-
-
Vincent Sellier authored
Related to T3838
-
- Dec 16, 2021
-
-
Antoine R. Dumont authored
This also:
- drops spurious copyright headers from those files, if present
- adds the missing iso8601 runtime dependency

Related to T3812
-
- Dec 13, 2021
-
-
Antoine Lambert authored
Since we are not using the elasticsearch startup script, we must explicitly set the LIBFFI_TMPDIR environment variable or elasticsearch will fail to start. See https://www.elastic.co/guide/en/elasticsearch/reference/current/executable-jna-tmpdir.html Closes T3803
-
- Nov 23, 2021
-
-
Antoine R. Dumont authored
This fixes build [1]

[1] https://jenkins.softwareheritage.org/view/swh-draft/job/DSEA/job/tests/972/console
-
- Nov 19, 2021
-
-
Antoine Pietri authored
-
- Oct 26, 2021
-
-
Antoine Lambert authored
Some date values found in codemeta.json files (dateCreated, dateModified, datePublished) might be in a format that elasticsearch cannot parse, which prevents successful updates of origin intrinsic metadata in elasticsearch indices. For instance, the date 2021-7-23 cannot be parsed by elasticsearch, as it expects 2021-07-23 instead. So ensure CodeMeta dates are properly formatted to avoid such indexing errors.
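As a minimal sketch of the kind of normalization described above (not the code from this commit), the helper below zero-pads the month and day of a plain `YYYY-M-D` date. The name `normalize_date` is hypothetical, it relies only on the standard library, and it deliberately ignores full timestamps.

```python
import re
from datetime import date
from typing import Optional

# Matches plain dates whose month/day may lack zero padding, e.g. "2021-7-23".
_DATE_RE = re.compile(r"^(\d{4})-(\d{1,2})-(\d{1,2})$")


def normalize_date(value: str) -> Optional[str]:
    """Return the date as zero-padded YYYY-MM-DD, or None if it is not a valid plain date.

    Hypothetical helper for illustration; values carrying a time component
    are not handled here.
    """
    match = _DATE_RE.match(value.strip())
    if match is None:
        return None
    year, month, day = (int(group) for group in match.groups())
    try:
        # date() validates the ranges (e.g. rejects month 13).
        return date(year, month, day).isoformat()
    except ValueError:
        return None
```

Returning `None` for unparsable values lets the caller decide whether to drop the field rather than send a value that the index would reject.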
-
- Sep 29, 2021
-
-
Antoine Lambert authored
New visit types are or will be available in production so we must add them here or we will have an error when searching such origins. Related to T3424
-
- Sep 28, 2021
-
-
Antoine Lambert authored
mypy started to detect an error in that function's implementation, and it is used nowhere in the swh codebase, so it is better to remove it.
-
Antoine Lambert authored
The Python elasticsearch module forbids the use of positional arguments in its latest release (7.15.0), in favor of keyword arguments only.
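To illustrate what a keyword-only signature implies for callers, here is a small sketch using a hypothetical stand-in function, `search_documents`. It is not the elasticsearch client API; it only mimics the calling convention.

```python
def search_documents(*, index: str, body: dict, size: int = 10) -> dict:
    """Hypothetical stand-in for a client method with keyword-only parameters.

    The bare `*` makes every parameter after it keyword-only.
    """
    return {"index": index, "body": body, "size": size}


# Keyword call: accepted.
result = search_documents(index="origin", body={"query": {"match_all": {}}})

# Positional call: rejected with a TypeError.
try:
    search_documents("origin", {"query": {"match_all": {}}})
    positional_call_accepted = True
except TypeError:
    positional_call_accepted = False
```

Call sites that passed arguments positionally therefore need to name each argument explicitly.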
-
Antoine Lambert authored
Since rDMODf56becc196ed6dd4b211c97096654c4400b047ec, an error is raised when calling that function with a dict containing unexpected keys. So use the recommended way to get the origin identifier now that the function is deprecated.
-
Antoine Lambert authored
Side effect of rDMOD57ae405d312879bec19107d29a20c2c290d7861d
-
- Sep 08, 2021
- Sep 07, 2021
-
-
vlorentz authored
This should resolve T3562.
-
- Sep 03, 2021
-
-
Antoine Lambert authored
When installing swh-search in develop mode, the tree-sitter node module must be installed in order to generate the parser C file.
-
- Sep 02, 2021
-
-
Antoine Lambert authored
Tree-sitter parser compilation is now handled in setup.py or directly in the code, so that file can be removed. It also fixes the `make test` invocation.
-
- Aug 31, 2021
-
-
vlorentz authored
- Aug 26, 2021
-
-
vlorentz authored
.so is still built, but only for binary distributions (it can be built from a source distribution). To do this, we now include intermediary files (parser.c in particular) in the source distribution. They are not really source files, but this removes the dependency on a nodejs runtime when installing from PyPI.
-
Antoine Lambert authored
It makes it possible to return the origin counts per visit type, and to get all available visit types dynamically in other components like swh-web. The underlying elasticsearch query has been tested on the production cluster and is pretty efficient. Related to T3441.
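The commit message does not say which query is used. As a rough sketch, counts per visit type could be obtained with an elasticsearch terms aggregation. The field name `visit_types` and both function names below are assumptions made for illustration, and no client call is shown.

```python
from typing import Dict

# Assumption: each origin document has a keyword field listing its visit types.
VISIT_TYPES_FIELD = "visit_types"  # hypothetical field name


def build_visit_type_counts_query(max_types: int = 100) -> dict:
    """Build a search body that returns only per-visit-type document counts."""
    return {
        "size": 0,  # no hits needed, only the aggregation result
        "aggs": {
            "visit_types": {
                "terms": {"field": VISIT_TYPES_FIELD, "size": max_types}
            }
        },
    }


def parse_visit_type_counts(response: dict) -> Dict[str, int]:
    """Turn a terms-aggregation response into a {visit_type: count} mapping."""
    buckets = response["aggregations"]["visit_types"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}
```

The keys of the resulting mapping also give the list of visit types currently present in the index, which is what a consumer such as swh-web would need.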
- Aug 19, 2021
-
-
Antoine Lambert authored
The swh_sql.so file must be built prior to running the tests.
-
- Aug 18, 2021
-
-
Kumar Shivendu authored
- Export grammar tokens through tokens.js file (so that swh-web can use them)
- Improvements in the grammar to better handle sort_by and limit
- Introduce annotateFilter function in grammar.js to assign tree-sitter fields to different parts of a filter (field, op, value)
-
- Aug 17, 2021
-
-
Kumar Shivendu authored
Add support for the rpc server in the swh-search cli (like other swh services)
-
- Aug 16, 2021
-
- Aug 13, 2021
-
-
vlorentz authored
and rewrite them in Python, with error checking (instead of failing silently)
-
Vincent Sellier authored
Related to T3484
-
- Aug 09, 2021
-
- Aug 06, 2021
-
-
Kumar Shivendu authored
Integrate the query language translator in the Elasticsearch implementation
-
- Jul 30, 2021
-
-
vlorentz authored
Putting it in a subdirectory that isn't a subpackage should make it undiscoverable.
-
Kumar Shivendu authored
Translate swh search query language queries into Elasticsearch DSL
-
- Jul 28, 2021
-
-
vlorentz authored
Like this:

```
ql_rel_paths = [
    "swh_ql.so",  # installed
    "../../query_language/swh_ql.so",  # development
]
for ql_rel_path in ql_rel_paths:
    ql_path = resource_filename("swh.search", ql_rel_path)
    if os.path.exists(ql_path):
        break
else:
    assert False, 'not found'
search_ql = Language(ql_path, "swh_search_ql")
```

`data_files` is not designed to be accessed from the same Python package, but to write files in standard locations (typically `.desktop` files in /usr/share) that other packages read with their own discovery mechanisms.
-
vlorentz authored
-
- Jul 26, 2021
-
-
Kumar Shivendu authored
The grammar should not allow using sort_by and limit more than once throughout the query. Unlike other filters, these two must not be concatenated by 'and' or 'or'
-
Kumar Shivendu authored
This revision defines the grammar for the search query language and prepares swh.search for a smoother development of the grammar.

The parsers generated from the proposed grammar serve two different purposes:
- Translation of search queries into elasticsearch DSL in swh.search (or any other search backend that we may use in the future)
- Autocompletion of the queries in the swh.web (Archive UI)

tree-sitter has been selected for the task because it has bindings for python (swh.search) as well as wasm (swh.web).
-
- Jul 23, 2021
-
-
Nicolas Dandrimont authored
Sometimes, under heavy load, the producer can return and let the consumer start before the topic is actually created. Adding a `producer.flush()` avoids that race condition.
-
- Jul 22, 2021
-
-
Kumar Shivendu authored
Documentation for the proposed search query language
-
- Jul 21, 2021
-
-
Nicolas Dandrimont authored
The origin_visit_status topic now contains the `type` key, which is all the information that we used from the origin_visit topic; we can stop processing that topic altogether.
-
Nicolas Dandrimont authored
-