
Add an origin blocking proxy

David Douard requested to merge douardda/swh-storage:blocklist into master

This proxy prevents registered origins from being visited again. If an origin URL matches a blocking rule, any attempt to add an Origin, OriginVisit or OriginVisitStatus object targeting this URL is blocked and raises a BlockedOriginException.
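As a rough illustration of the behaviour described above, here is a minimal sketch. Everything except the exception name is hypothetical: `BlockingProxySketch`, `InMemoryBackend`, the use of plain dicts for origins, and the exact-match-only check are assumptions made for this example, not the code in this merge request.

```python
from typing import Dict, Iterable, List, Set


class BlockedOriginException(Exception):
    """Stand-in for the exception named in the description."""


class InMemoryBackend:
    """Hypothetical backend that simply records the origins it receives."""

    def __init__(self) -> None:
        self.origins: List[Dict[str, str]] = []

    def origin_add(self, origins: Iterable[Dict[str, str]]) -> None:
        self.origins.extend(origins)


class BlockingProxySketch:
    """Hypothetical proxy: refuses writes whose URL is in the blocked set."""

    def __init__(self, backend: InMemoryBackend, blocked_urls: Set[str]) -> None:
        self.backend = backend
        self.blocked_urls = blocked_urls

    def origin_add(self, origins: Iterable[Dict[str, str]]) -> None:
        origins = list(origins)
        # Assumption: exact-match lookup only; the real matching rules are richer.
        blocked = [o["url"] for o in origins if o["url"] in self.blocked_urls]
        if blocked:
            # Nothing is forwarded to the backend when any URL is blocked.
            raise BlockedOriginException(blocked)
        self.backend.origin_add(origins)
```

In this sketch the whole batch is rejected as soon as one URL is blocked; whether the actual proxy rejects the batch or only the offending objects is not stated in the description.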

This is implemented in a similar fashion to the MaskingProxy and shares the same management logic as the latter.

Given a URL to check, the matching rules are:

  • check for an exact match in the blocking rules on:

    1. the given URL
    2. the trimmed URL (if it has a trailing /)
    3. the extension-less URL if it ends with a known suffix (e.g. '.git')
  • if no exact match is found, look for the best prefix match on the URL's sub-paths, splitting on '/' (i.e. the longest URL in the blocking rules that the checked URL starts with)
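The matching rules above can be sketched as follows. This is an illustration under stated assumptions, not the merge request's implementation: `find_blocking_rule` and `KNOWN_SUFFIXES` are hypothetical names, the rules are modelled as a plain set of URL strings, and the suffix is stripped from the trimmed URL (the description does not say which form it applies to).

```python
from typing import Optional, Set

# Assumption: '.git' is the only suffix named in the description.
KNOWN_SUFFIXES = (".git",)


def find_blocking_rule(url: str, rules: Set[str]) -> Optional[str]:
    """Return the rule matching `url`, or None if no rule applies."""
    # Exact-match candidates, in the order listed in the description.
    trimmed = url.rstrip("/")
    candidates = [url, trimmed]
    for suffix in KNOWN_SUFFIXES:
        if trimmed.endswith(suffix):
            candidates.append(trimmed[: -len(suffix)])
    for candidate in candidates:
        if candidate in rules:
            return candidate

    # Prefix match: drop one '/'-separated component at a time, so the
    # first hit is the longest rule that is a path prefix of the URL.
    prefix = trimmed
    while "/" in prefix:
        prefix = prefix.rsplit("/", 1)[0]
        if prefix in rules:
            return prefix
    return None
```

Because prefixes are only cut at '/' boundaries, a rule for `https://example.com/user` matches `https://example.com/user/repo` but not `https://example.com/username`.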

Notes:

Related to swh/meta#5088

Edited by David Douard
