RemoteGraphClient.neighbors() should take `max_matching_nodes` instead of `max_edges`
Using the `max_edges` parameter with the `RemoteGraphClient.neighbors()` method gives meaningless results. Example:
>>> from swh.graph.http_client import RemoteGraphClient
>>> client = RemoteGraphClient("http://granet.internal.softwareheritage.org:5009/graph")
>>> list(client.neighbors("swh:1:dir:db40b68089ab62e6e8fc885d70d40345793dd9da", direction="backward"))
['swh:1:dir:6190fc14ce86e6a9fee1e6d2b35af740f262362b']
>>> list(client.neighbors("swh:1:dir:db40b68089ab62e6e8fc885d70d40345793dd9da", direction="backward", max_edges=1))
['']
(The fact that it returns a list with an empty string instead of an empty list has been reported as #4790 (closed).)
One would expect to get at least one result when `max_edges=1` is specified: one edge is visited, resulting in one neighbor returned.
Using HTTP requests directly gives the same result:
$ curl 'http://granet.internal.softwareheritage.org:5009/graph/neighbors/swh:1:dir:db40b68089ab62e6e8fc885d70d40345793dd9da?direction=backward'
swh:1:dir:6190fc14ce86e6a9fee1e6d2b35af740f262362b
$ curl 'http://granet.internal.softwareheritage.org:5009/graph/neighbors/swh:1:dir:db40b68089ab62e6e8fc885d70d40345793dd9da?direction=backward&max_edges=1'
The HTTP server accepts a `max_matching_nodes` parameter (which is not documented for `/graph/neighbors`), and it gives the intended result:
$ curl 'http://granet.internal.softwareheritage.org:5009/graph/neighbors/swh:1:dir:db40b68089ab62e6e8fc885d70d40345793dd9da?direction=backward&max_matching_nodes=1'
swh:1:dir:6190fc14ce86e6a9fee1e6d2b35af740f262362b
… but `RemoteGraphClient.neighbors()` does not implement it.
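As a stopgap, the endpoint can be queried directly with `max_matching_nodes`. The following is only a minimal sketch, assuming the server returns one SWHID per line as in the curl output above. The helper names `neighbors_url`, `parse_swhids` and `neighbors_http` are hypothetical and not part of `swh.graph`.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen

GRAPH_URL = "http://granet.internal.softwareheritage.org:5009/graph"


def neighbors_url(base_url, swhid, direction="forward", max_matching_nodes=None):
    """Build the /neighbors URL (hypothetical helper, not part of swh.graph)."""
    params = {"direction": direction}
    if max_matching_nodes is not None:
        params["max_matching_nodes"] = max_matching_nodes
    # Keep ':' unescaped so the SWHID appears in the path as in the curl calls.
    path = quote(swhid, safe=":")
    return f"{base_url.rstrip('/')}/neighbors/{path}?{urlencode(params)}"


def parse_swhids(body):
    """Split a response body into SWHIDs, skipping blank lines.

    Assumption: the server answers with one SWHID per line. An empty body
    yields [] rather than [''].
    """
    return [line.strip() for line in body.splitlines() if line.strip()]


def neighbors_http(base_url, swhid, **kwargs):
    """Fetch neighbors over plain HTTP, bypassing RemoteGraphClient."""
    with urlopen(neighbors_url(base_url, swhid, **kwargs)) as response:
        return parse_swhids(response.read().decode("utf-8"))
```

Calling `neighbors_http(GRAPH_URL, "swh:1:dir:db40b68089ab62e6e8fc885d70d40345793dd9da", direction="backward", max_matching_nodes=1)` issues the same request as the last curl command above.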
Is it meaningful to allow `max_edges` for `/graph/neighbors` and the `.neighbors()` method? If so, is the current behavior correct?