Implement a traversal method returning nodes from external inbound edges
To optimize removal of objects from the archive, we currently look up swh.graph before querying swh.storage. The algorithm is currently done in two steps:
- Get the list of candidates for removal, that is, the whole subgraph of the Software Heritage archive reachable from the requested object.
- For all the candidates for removal, check if they are referenced by any other object in the archive.
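
For illustration, the two steps above can be sketched on a toy in-memory graph. This is only a sketch under stated assumptions: `successors` and `predecessors` are plain dicts standing in for swh.graph, the function names are hypothetical, and "referenced by any other object" is read as "has a predecessor outside the candidate set". It also ignores that a candidate kept alive by an outside reference keeps its own descendants alive.

```python
from collections import deque


def find_candidates(successors, root):
    """Step 1: BFS collecting every node reachable from `root`."""
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for succ in successors.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return seen


def find_removable(successors, predecessors, root):
    """Step 2: keep only candidates whose predecessors are all candidates."""
    candidates = find_candidates(successors, root)
    return {
        node
        for node in candidates
        if all(pred in candidates for pred in predecessors.get(node, ()))
    }


# Toy graph (hypothetical identifiers): "other" also references "dir1".
successors = {"rel": ["rev1"], "rev1": ["dir1"], "other": ["dir1"]}
predecessors = {"rev1": ["rel"], "dir1": ["rev1", "other"]}
removable = find_removable(successors, predecessors, "rel")
```

Here `dir1` is a candidate but is not removable, because `other` still points to it.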
To further optimize the implementation, we could have the initial BFS made with swh.graph also return a list of external inbound edges. A node with external inbound edges most likely cannot be removed, but all of its predecessors might have been removed from the archive since the swh.graph export. So we need this list to check against swh.storage whether at least one of these predecessors is still present in the archive; only if none remains is the node confirmed free of external references.
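
A minimal sketch of what such a traversal could return, again on plain dicts rather than the real swh.graph API. The names `traverse_with_external_inbound`, `externally_referenced` and `still_in_storage` are hypothetical; `still_in_storage` stands in for whatever swh.storage lookup confirms that an object is still archived.

```python
from collections import deque


def traverse_with_external_inbound(successors, predecessors, root):
    """BFS from `root`; also report inbound edges coming from outside the subgraph."""
    subgraph = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for succ in successors.get(node, ()):
            if succ not in subgraph:
                subgraph.add(succ)
                queue.append(succ)
    # External predecessors can only be decided once the whole subgraph is known.
    external = {}
    for node in subgraph:
        outside = {p for p in predecessors.get(node, ()) if p not in subgraph}
        if outside:
            external[node] = outside
    return subgraph, external


def externally_referenced(external, still_in_storage):
    """Nodes with at least one external predecessor still present in storage."""
    return {
        node
        for node, preds in external.items()
        if any(still_in_storage(pred) for pred in preds)
    }


# Toy graph (hypothetical identifiers): "other" references "dir1" in the export.
successors = {"rel": ["rev1"], "rev1": ["dir1"], "other": ["dir1"]}
predecessors = {"rev1": ["rel"], "dir1": ["rev1", "other"]}
subgraph, external = traverse_with_external_inbound(successors, predecessors, "rel")

# If "other" was removed from the archive after the export, "dir1" is free again.
blocked_if_present = externally_referenced(external, lambda swhid: True)
blocked_if_removed = externally_referenced(external, lambda swhid: swhid != "other")
```

With this shape, the storage check only needs to run on the predecessors listed in `external`, not on every candidate.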