Commit 6fe7f89e authored by Antoine Lambert, committed by Antoine Lambert

api: Remove visit_types field in origins endpoint responses

The recent addition of the visit_types field to origins endpoint responses
significantly degraded the performance of HTTP requests, because visit types
info is fetched on the backend side by sending a request to elasticsearch
for each origin in a page.

An endpoint client (OpenAIRE) observed that requests could fail with read
timeout errors when elasticsearch did not respond in time while fetching
visit types info.

So revert the addition of that field in order to significantly speed up
response time, as the backend will no longer send requests to elasticsearch.
parent d3780a19
1 merge request: !1321 api: Remove visit_types field in origins endpoint responses
Pipeline #11086 passed
@@ -30,13 +30,17 @@ DOC_RETURN_ORIGIN = """
 :http:get:`/api/1/raw-extrinsic-metadata/swhid/(target)/authorities/`
 to get the list of metadata authorities providing extrinsic metadata
 on this origin (and, indirectly, to the origin's extrinsic metadata itself)
-:>json array visit_types: set of visit types for that origin
 :>json boolean has_visits: indicates if Software Heritage made at least one full
 visit of the origin
 """
 DOC_RETURN_ORIGIN_ARRAY = DOC_RETURN_ORIGIN.replace(":>json", ":>jsonarr")
+DOC_RETURN_ORIGIN += (
+    " :>json array visit_types: set of visit types for that origin"
+)
 DOC_RETURN_ORIGIN_VISIT = """
 :>json string date: ISO8601/RFC3339 representation of the visit date (in UTC)
 :>json str origin: the origin canonical url
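Review note: the order of the statements in this hunk matters. The array variant is derived before `visit_types` is appended, so only the single-origin documentation mentions the field. A minimal sketch of that ordering, assuming a shortened stand-in docstring (the `url` line below is illustrative, not the real text):

```python
# Shortened, hypothetical stand-in for the real docstring fragment.
DOC_RETURN_ORIGIN = """
    :>json string url: the origin url
    :>json boolean has_visits: indicates if at least one full visit was made
"""

# Derive the array variant first, so it never contains visit_types.
DOC_RETURN_ORIGIN_ARRAY = DOC_RETURN_ORIGIN.replace(":>json", ":>jsonarr")

# Only the single-origin documentation advertises visit_types.
DOC_RETURN_ORIGIN += (
    "    :>json array visit_types: set of visit types for that origin"
)

print("visit_types" in DOC_RETURN_ORIGIN)
print("visit_types" in DOC_RETURN_ORIGIN_ARRAY)
```

Swapping the second and third statements would propagate the `visit_types` line into the array documentation as well.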
@@ -82,6 +86,7 @@ def api_origins(request: Request):
:query int origin_count: The maximum number of origins to return
(default to 100, cannot exceed 10000)
{return_origin_array}
{common_headers}
@@ -105,6 +110,7 @@ def api_origins(request: Request):
limit = min(int(request.query_params.get("origin_count", "100")), 10000)
page_result = archive.lookup_origins(page_token, limit)
origins = [enrich_origin(o, request=request) for o in page_result.results]
next_page_token = page_result.next_page_token
......
@@ -192,10 +192,11 @@ def lookup_content_license(q):
     return converters.from_swh(lic, hashess={"id"})
-def _origin_info(origin: Origin) -> OriginInfo:
+def _origin_info(origin: Origin, with_visit_types: bool = True) -> OriginInfo:
     origin_dict = origin.to_dict()
-    origin_data = search.origin_get(origin.url) or {}
-    origin_dict["visit_types"] = list(origin_data.get("visit_types", []))
+    if with_visit_types:
+        origin_data = search.origin_get(origin.url) or {}
+        origin_dict["visit_types"] = list(origin_data.get("visit_types", []))
     return converters.from_origin(origin_dict)
@@ -253,7 +254,7 @@ def lookup_origins(
     """
     page = storage.origin_list(page_token=page_token, limit=limit)
     return PagedResult(
-        [_origin_info(o) for o in page.results],
+        [_origin_info(o, with_visit_types=False) for o in page.results],
         next_page_token=page.next_page_token,
     )
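Review note: the effect of the new `with_visit_types` flag can be illustrated with a small self-contained sketch. Everything below is a hypothetical stand-in (the `Origin` dataclass, the `CountingSearch` client, and `origin_info` without the `converters.from_origin` step); it only mirrors the control flow of the diff and is not the project's actual code.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Origin:
    """Hypothetical stand-in for the storage model's Origin."""

    url: str

    def to_dict(self) -> Dict[str, Any]:
        return {"url": self.url}


class CountingSearch:
    """Hypothetical search client that records how often it is queried."""

    def __init__(self) -> None:
        self.calls = 0

    def origin_get(self, url: str) -> Dict[str, Any]:
        self.calls += 1
        return {"url": url, "visit_types": ["git"]}


search = CountingSearch()


def origin_info(origin: Origin, with_visit_types: bool = True) -> Dict[str, Any]:
    origin_dict = origin.to_dict()
    if with_visit_types:
        # One search round trip per origin: cheap for a single lookup,
        # costly when repeated for every origin in a page.
        origin_data = search.origin_get(origin.url) or {}
        origin_dict["visit_types"] = list(origin_data.get("visit_types", []))
    return origin_dict


page = [Origin(f"https://example.org/repo{i}") for i in range(100)]

# Paginated listing: the flag is off, so the search backend is never queried.
listing = [origin_info(o, with_visit_types=False) for o in page]
calls_for_listing = search.calls

# Single-origin lookup: the default keeps visit_types at the cost of one query.
single = origin_info(page[0])
calls_after_single = search.calls
```

With the flag left at its default, listing the same 100 origins would issue 100 search requests, which matches the per-origin cost described in the commit message.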