Remove deprecated documentation

This documentation is duplicated on the main readme or does not match the way a mirror is now deployed. Related to T3829

Remove deprecated documentation
This documentation is duplicated on the main readme or does not match the way a mirror is now deployed. Related to T3829
bfcc0229 · Vincent Sellier · 0526bf35 · 0526bf35 · 0526bf35 · 0526bf35
Verified Commit bfcc0229 authored 3 years ago by Vincent Sellier
--- a/docs/install-objstorage.rst
+++ b/docs/install-objstorage.rst
-.. highlight:: bash
-.. _objstorage_install:
-Object Storage
-==============
-The machine that hosts the Object Storage must have access to enough storage so
-the whole content of the Archive can be copied. It should be at least 300TiB
-(as of today, the whole content of the archive represent around 200TiB of
-compressed data).
-The object storage does not require a database, however, it needs a way to
-store objects (blobs).
-There are several backends currently available:
- `pathslicing`: store objects in a POSIX filesystem
- `remote`: use a HTTP based RPC exposing an existing objstorage,
- `pathslicing`: store objects in a POSIX filesystem
- `azure`: use Azure's storage,
- `azure-prefixed`: Azure's storage with a prefix; typically used in
-  conjunction with the `multiplexer` backend (see below) to distribute the
-  storage amoung a set of Azure tokens for better performances,
- `memory`: keep everything in RAM, for testing purpose,
- `weed`: use seaweedfs as blob storage,
- `rados`: RADOS based object storage (Ceph),
- `s3`: Amazon S3 storage,
- `swift`: OpensStack Swift storage.
- `multiplexer`: assemble several objstorages as once.
- `striping`: xxx
- `filtered`: xxx
-Beware that not all these backends are production ready.
-Please read the documentation of each of these backends for more details on how
-to configure them.
-Docker based deployment
-----------------------
-In a docker based deployment, all machine names must be resolvable from within
-a docker container and accessible from there.
-When testing this guide on a single docker host, the simplest solution is to
-start your docker containers linked to a common bridge::
-  $ docker network create swh
-  e0d85947d4f53f8b2f0393517f373ab4f5b06d02e1efa07114761f610b1f7afa
-  $
-In the examples below we will use such a network config.
-Build the image
-~~~~~~~~~~~~~~~
-```
-$ docker build -t swh/base https://forge.softwareheritage.org/source/swh-docker.git
-```
-Configure the object storage
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-In this example, we use a local storage as backend located in
-`/srv/softwareheritage/objects`. Write a configuration file named
-`objstorage.yml`::
-  objstorage:
-    cls: pathslicing
-    args:
-      root: /srv/softwareheritage/objects
-      slicing: 0:5
-  client_max_size: 1073741824
-Testing the configuration
-~~~~~~~~~~~~~~~~~~~~~~~~~
-Then start the test SWGI server for the RPC objstorage service::
-  $ docker run --rm  \
-      --network swh \
-      -v ${PWD}/objstorage.yml:/etc/softwareheritage/config.yml \
-      -v /srv/softwareheritage/objects:/srv/softwareheritage/objects \
-      -p 5003:5000 \
-      swh/base objstorage
-You should be able to query this server (fraom another terminal since the
-container is started attached to its console)::
-  $ http http://127.0.0.1:5003
-  HTTP/1.1 200 OK
-  Content-Length: 25
-  Content-Type: text/plain; charset=utf-8
-  Date: Thu, 20 Jun 2019 08:02:32 GMT
-  Server: Python/3.5 aiohttp/3.5.1
-  SWH Objstorage API server
-  $ http http://127.0.0.1:5003/check_config check_write=True
-  HTTP/1.1 200 OK
-  Content-Length: 4
-  Content-Type: application/json
-  Date: Thu, 20 Jun 2019 08:06:58 GMT
-  Server: Python/3.5 aiohttp/3.5.4
-  true
-Note: in the example above, we use httpie_ as HTTP client. You can use any
-other tool (curl, wget...)
-.. _httpie: https://httpie.org
-Since we started this container attached, just hit Ctrl+C to quit in the
-terminal in which the docker container is running.
-Running in production
-~~~~~~~~~~~~~~~~~~~~~
-This container uses gunicorn as SWGI server. However, since this later does not
-handle the HTTP stack well enough for a production system, it is recommanded to
-run this behind a proper HTTP server like nginx.
-First, we start the objstorage container without exposing the TCP port, but
-using a mounted file as socket to be able to share it with other containers.
-Here, we create this socket file in `/srv/softwareheritage/objstorage.sock`::
-  $ docker run -d --name objstorage \
-      --network swh \
-      -v ${PWD}/objstorage.yml:/etc/softwareheritage/config.yml \
-      -v /srv/softwareheritage/objects:/srv/softwareheritage/objects \
-      -v /srv/softwareheritage/socks/objstorage:/var/run/gunicorn/swh \
-      swh/base objstorage
-And start an HTTP server that will proxy the UNIX socket
-`/srv/softwareheritage/socks/objstorage.sock`. Using Nginx, you can use the
-following `nginx.conf` file::
-  worker_processes  4;
-  # Show startup logs on stderr; switch to debug to print, well, debug logs when
-  # running nginx-debug
-  error_log /dev/stderr info;
-  events {
-    worker_connections 1024;
-  }
-  http {
-    include            mime.types;
-    default_type       application/octet-stream;
-    sendfile           on;
-    keepalive_timeout  65;
-    # Built-in Docker resolver. Needed to allow on-demand resolution of proxy
-    # upstreams.
-    resolver           127.0.0.11 valid=30s;
-    upstream app_server {
-      # fail_timeout=0 means we always retry an upstream even if it failed
-      # to return a good HTTP response
-      # for UNIX domain socket setups
-      server unix:/tmp/gunicorn/gunicorn.sock fail_timeout=0;
-        }
-    server {
-      listen             80 default_server;
-      # Add a trailing slash to top level requests
-      rewrite ^/([^/]+)$ /$1/ permanent;
-      location / {
-        set $upstream "http://app_server";
-        proxy_pass $upstream;
-      }
-    }
-  }
-And run nginx in a docker container with::
-  $ docker run \
-      --network swh \
-      -v ${PWD}/conf/nginx.conf:/etc/nginx/nginx.conf:ro \
-        -v /tmp/objstorage/objstorage.sock:/tmp/gunicorn.sock \
-      -p 5003:80 \
-      nginx
-Which you can check for proper fucntionning also::
-  $ http :5003/check_config check_write=True
-  HTTP/1.1 200 OK
-  Connection: keep-alive
-  Content-Length: 1
-  Content-Type: application/x-msgpack
-  Date: Thu, 20 Jun 2019 10:13:39 GMT
-  Server: nginx/1.17.0
-  true
-If you want your docker conotainers to start automatically, add the
-`--restart=always` option to docker commands above. This should prevent you
-from having to write custom service unit files.
-Manual installation on a Debian system
--------------------------------------
-Ensure you have a Debian machine with Software Heritage apt repository
-:ref:`properly configured <swh_debian_repo>`.
-There are several storage scenarios supported by the :ref:`Object Storage
-<swh-storage>`. We will focus on a simple scenario where local storage is used
-using a regular filesystem.
-Let's assume this storage capacity is available on `/srv/softwareheritage`.
- Install the Object Storage package and dependencies::
-   ~$ sudo apt install python3-swh.objstorage gunicorn3 nginx-light
- Create a dedicated `swh` user::
-   ~$ sudo useradd -md /srv/softwareheritage -s /bin/bash swh
- Create the required directory for objects storage::
-    ~$ sudo mkdir  /srv/softwareheritage/objects
-    ~$ sudo chown swh: /srv/softwareheritage/objects
- Configure the Object Storage RPC Server::
-    ~$ sudo mkdir /etc/softwareheritage/
-    ~$ sudo sh -c 'cat > /etc/softwareheritage/objstorage.yml' <<EOF
-    > objstorage:
-    >   cls: pathslicing
-    >   args:
-    >     root: /srv/softwareheritage/objects
-    >     slicing: 0:5
-    >
-    > client_max_size: 1073741824
-    > EOF
-    ~$
- Ensure the Object Storage service can be started by hand::
-    ~$ sudo -u swh swh-objstorage -C  /etc/softwareheritage/objstorage.yml serve
-    ======== Running on http://0.0.0.0:5003 ========
-    (Press CTRL+C to quit)
-  In another terminal, check the HTTP server responds properly::
-    ~$ curl 127.0.0.1:5003
-    SWH Objstorage API server
-    ~$
-  Quit the test server by hitting Ctrl+C in the terminal it is running in.
- Ensure reauired directories for gunicorn exists::
-    ~$ sudo mkdir -p /etc/gunicorn/instances
-    ~$ sudo mkdir -p /var/run/gunicorn/swh-objstorage/
-    ~$ sudo chown swh: /var/run/gunicorn/swh-objstorage/
- Copy the gunicorn config file below to `/etc/gunicorn/instances/objstorage.cfg`::
-    import traceback
-    import gunicorn.glogging
-    class Logger(gunicorn.glogging.Logger):
-        log_only_errors = True
-        def access(self, resp, req, environ, request_time):
-            """ See http://httpd.apache.org/docs/2.0/logs.html#combined
-            for format details
-            """
-            if not (self.cfg.accesslog or self.cfg.logconfig or self.cfg.syslog):
-                return
-            # wrap atoms:
-            # - make sure atoms will be test case insensitively
-            # - if atom doesn't exist replace it by '-'
-            atoms = self.atoms(resp, req, environ, request_time)
-            safe_atoms = self.atoms_wrapper_class(atoms)
-            try:
-                if self.log_only_errors and str(atoms['s']) == '200':
-                    return
-                self.access_log.info(self.cfg.access_log_format % safe_atoms, extra={'swh_atoms': atoms})
-            except:
-                self.exception('Failed processing access log entry')
-    logger_class = Logger
-    logconfig = '/etc/gunicorn/logconfig.ini'
-    # custom settings
-    bind = "unix:/run/gunicorn/swh-objstorage/gunicorn.sock"
-    workers = 16
-    worker_class = "aiohttp.worker.GunicornWebWorker"
-    timeout = 3600
-    graceful_timeout = 3600
-    keepalive = 5
-    max_requests = 0
-    max_requests_jitter = 0
-    # Uncomment the following lines if you want statsd monitoring
-    # statsd_host = "127.0.0.1:8125"
-    # statsd_prefix = "swh-objstorage"
- Copy the logging config file to `/etc/gunicorn/logconfig.ini`::
-    [loggers]
-    keys=root, gunicorn.error, gunicorn.access
-    [handlers]
-    keys=console, journal
-    [formatters]
-    keys=generic
-    [logger_root]
-    level=INFO
-    handlers=console,journal
-    [logger_gunicorn.error]
-    level=INFO
-    propagate=0
-    handlers=journal
-    qualname=gunicorn.error
-    [logger_gunicorn.access]
-    level=INFO
-    propagate=0
-    handlers=journal
-    qualname=gunicorn.access
-    [handler_console]
-    class=StreamHandler
-    formatter=generic
-    args=(sys.stdout, )
-    [handler_journal]
-    class=swh.core.logger.JournalHandler
-    formatter=generic
-    args=()
-    [formatter_generic]
-    format=%(asctime)s [%(process)d] [%(levelname)s] %(message)s
-    datefmt=%Y-%m-%d %H:%M:%S
-    class=logging.Formatter
- Ensure the Object Storage server can be started via gunicorn::
-    ~$ SWH_CONFIG_FILENAME=/etc/softwareheritage/objstorage.yml \
-       gunicorn3 -c /etc/gunicorn/instances/objstorage.cfg swh.objstorage.api.wsgi
-    [...]
-    ^C
-    ~$
- Add a `systemd` Service Unit file for this gunicorn WSGI server; copy the
-  file below to `/etc/systemd/system/gunicorn-swh-objstorage.service`::
-    [Unit]
-    Description=Gunicorn instance swh-objstorage
-    ConditionPathExists=/etc/gunicorn/instances/swh-objstorage.cfg
-    PartOf=gunicorn.service
-    ReloadPropagatedFrom=gunicorn.service
-    Before=gunicorn.service
-    [Service]
-    User=swhstorage
-    Group=swhstorage
-    PIDFile=/run/gunicorn/swh-objstorage/pidfile
-    RuntimeDirectory=/run/gunicorn/swh-objstorage
-    WorkingDirectory=/run/gunicorn/swh-objstorage
-    Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/objstorage.yml
-    Environment=SWH_LOG_TARGET=journal
-    ExecStart=/usr/bin/gunicorn3 -p /run/gunicorn/swh-objstorage/pidfile -c /etc/gunicorn/instances/objstorage.cfg swh.objstorage.api.wsgi
-    ExecStop=/bin/kill -TERM $MAINPID
-    ExecReload=/bin/kill -HUP $MAINPID
-    [Install]
-    WantedBy=multi-user.target
-  And the file below to `/etc/systemd/system/gunicorn.service`::
-    [Unit]
-    Description=All gunicorn services
-    [Service]
-    Type=oneshot
-    ExecStart=/bin/true
-    ExecReload=/bin/true
-    RemainAfterExit=on
-    [Install]
-    WantedBy=multi-user.target
- Load these Service Unit files and activate them::
-    ~$ sudo systemctl daemon-reload
-    ~$ sudo systemctl enable --now gunicorn-swh-objstorage.service
- Configure the nginx HTTP server as a reverse proxy for the gunicorn SWGI
-  server; here is an example of the file `/etc/nginx/nginx.conf`::
-    user www-data;
-    worker_processes 16;
-    worker_rlimit_nofile 1024;
-    pid        /var/run/nginx.pid;
-    error_log  /var/log/nginx/error.log error;
-    events {
-      accept_mutex off;
-      accept_mutex_delay 500ms;
-      worker_connections 1024;
-    }
-    http {
-      include       /etc/nginx/mime.types;
-      default_type  application/octet-stream;
-      access_log  /var/log/nginx/access.log;
-      sendfile    on;
-      server_tokens on;
-      types_hash_max_size 1024;
-      types_hash_bucket_size 512;
-      server_names_hash_bucket_size 128;
-      server_names_hash_max_size 1024;
-      keepalive_timeout   65s;
-      keepalive_requests  100;
-      client_body_timeout 60s;
-      send_timeout        60s;
-      lingering_timeout   5s;
-      tcp_nodelay         on;
-      gzip              on;
-      gzip_comp_level   1;
-      gzip_disable      msie6;
-      gzip_min_length   20;
-      gzip_http_version 1.1;
-      gzip_proxied      off;
-      gzip_vary         off;
-      client_body_temp_path   /var/nginx/client_body_temp;
-      client_max_body_size    10m;
-      client_body_buffer_size 128k;
-      proxy_temp_path         /var/nginx/proxy_temp;
-      proxy_connect_timeout   90s;
-      proxy_send_timeout      90s;
-      proxy_read_timeout      90s;
-      proxy_buffers           32 4k;
-      proxy_buffer_size       8k;
-      proxy_set_header        Host $host;
-      proxy_set_header        X-Real-IP $remote_addr;
-      proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
-      proxy_set_header        Proxy "";
-      proxy_headers_hash_bucket_size 64;
-      include /etc/nginx/conf.d/*.conf;
-      include /etc/nginx/sites-enabled/*;
-    }
-  `/etc/nginx/conf.d/swh-objstorage-gunicorn-upstream.conf`::
-    upstream swh-objstorage-gunicorn {
-      server     unix:/run/gunicorn/swh-objstorage/gunicorn.sock  fail_timeout=0;
-    }
-  `/etc/nginx/sites-enabled/swh-objstorage.conf`::
-    server {
-      listen 0.0.0.0:5003 deferred;
-      server_name           <hostname> 127.0.0.1 localhost ::1;
-      client_max_body_size 4G;
-      index  index.html index.htm index.php;
-      access_log            /var/log/nginx/nginx-swh-objstorage.access.log combined if=$error_status;
-      error_log             /var/log/nginx/nginx-swh-objstorage.error.log;
-      location / {
-        proxy_pass            http://swh-objstorage-gunicorn;
-        proxy_read_timeout    3600s;
-        proxy_connect_timeout 90s;
-        proxy_send_timeout    90s;
-        proxy_buffering       off;
-        proxy_set_header      Host $host;
-        proxy_set_header      X-Real-IP $remote_addr;
-        proxy_set_header      X-Forwarded-For $proxy_add_x_forwarded_for;
-        proxy_set_header      Proxy "";
-      }
-    }
-  Note the `<hostname>` in the example file above to adapt to your server name.
-  `/etc/nginx/conf.d/swh-objstorage-default.conf`::
-    server {
-      listen 0.0.0.0:5003 default_server;
-      server_name           nginx-swh-objstorage-default;
-      return 444;
-      index  index.html index.htm index.php;
-      access_log            /var/log/nginx/nginx-swh-objstorage-default.access.log combined;
-      error_log             /var/log/nginx/nginx-swh-objstorage-default.error.log;
-      location / {
-        index     index.html index.htm index.php;
-      }
-    }
- Restart the `nginx` service::
-    ~$ sudo systemctl restart nginx.service
- Check the whole stack is responding::
-    ~$ curl http://127.0.0.1:5003/
-    SWH Objstorage API serverd
-    ~$
--- a/docs/install-storage.rst
+++ b/docs/install-storage.rst
-.. highlight:: bash
-.. _storage_install:
-Graph Storage
-=============
-The machine that hosts the (graph) Storage must have access to a Postgresql
-database. It must also have access to a running objstorage instance. Setting up
-these services will not be covered here.
-In this guide, we assume that:
- Postgresql is running on machine `pghost` and the database `swh-storage`
-  exists and is owned by the postgresql user `swhuser`,
- objstorage is running on machine `objstorage` listening on port 5003.
-Docker based deployment
-----------------------
-In a docker based deployment, all machine names must be resolvable from within
-a docker container and accessible from there.
-When testing this guide on a single docker host, the simplest solution is to
-start your docker containers linked to a common bridge::
-  $ docker network create swh
-  e0d85947d4f53f8b2f0393517f373ab4f5b06d02e1efa07114761f610b1f7afa
-  $
-In the examples below we will use such a network config.
-Build the image
-~~~~~~~~~~~~~~~
-```
-$ docker build -t swh/base https://forge.softwareheritage.org/source/swh-docker.git
-```
-Configure the storage
-~~~~~~~~~~~~~~~~~~~~~
-Write a configuration file named `storage.yml`::
-  storage:
-    cls: local
-    args:
-      db: postgresql://swhuser:p4ssw0rd@pghost/swh-storage
-      objstorage:
-        cls: remote
-        args:
-          url: http://objstorage:5003
-  client_max_size: 1073741824
-Testing the configuration
-~~~~~~~~~~~~~~~~~~~~~~~~~
-Then start the test SWGI server for the RPC storage service::
-  $ docker run --rm  \
-      --network swh \
-      -v ${PWD}/storage.yml:/etc/softwareheritage/config.yml \
-      -p 5002:5000 \
-      swh/base storage
-You should be able to query this server (fraom another terminal since the
-container is started attached to its console)::
-  $ http http://127.0.0.1:5002
-  HTTP/1.1 200 OK
-  Content-Length: 25
-  Content-Type: text/plain; charset=utf-8
-  Date: Thu, 20 Jun 2019 08:02:32 GMT
-  Server: Python/3.5 aiohttp/3.5.1
-  SWH Objstorage API server
-  $ http http://127.0.0.1:5003/check_config check_write=True
-  HTTP/1.1 200 OK
-  Content-Length: 4
-  Content-Type: application/json
-  Date: Thu, 20 Jun 2019 08:06:58 GMT
-  Server: Python/3.5 aiohttp/3.5.4
-  true
-  $
-Note: in the example above, we use httpie_ as HTTP client. You can use any
-other tool (curl, wget...)
-.. _httpie: https://httpie.org
-Since we started this container attached, just hit Ctrl+C to quit in the
-terminal in which the docker container is running.
-Running in production
-~~~~~~~~~~~~~~~~~~~~~
-This container uses gunicorn as SWGI server. However, since this later does not
-handle the HTTP stack well enough for a production system, it is recommended to
-run this behind a proper HTTP server like nginx.
-First, we start the objstorage container without exposing the TCP port, but
-using a mounted file as socket to ba able to share it with other containers.
-Here, we create this socket file in `/srv/softwareheritage/objstorage.sock`::
-  $ docker run -d --name swh-objstorage \
-      --network swh \
-      -v ${PWD}/objstorage.yml:/etc/softwareheritage/config.yml \
-      -v /srv/softwareheritage/objects:/srv/softwareheritage/objects \
-      -v /srv/softwareheritage/socks:/var/run/gunicorn/swh \
-      swh/base objstorage
-And start an HTTP server that will proxy the UNIX socket
-`/srv/softwareheritage/socks/objstorage.sock`. Using Nginx, you can use the
-following `nginx.conf` file::
-  worker_processes  4;
-  # Show startup logs on stderr; switch to debug to print, well, debug logs when
-  # running nginx-debug
-  error_log /dev/stderr info;
-  events {
-    worker_connections 1024;
-  }
-  http {
-    include            mime.types;
-    default_type       application/octet-stream;
-    sendfile           on;
-    keepalive_timeout  65;
-    # Built-in Docker resolver. Needed to allow on-demand resolution of proxy
-    # upstreams.
-    resolver           127.0.0.11 valid=30s;
-    upstream app_server {
-      # fail_timeout=0 means we always retry an upstream even if it failed
-      # to return a good HTTP response
-      # for UNIX domain socket setups
-      server unix:/tmp/gunicorn/gunicorn.sock fail_timeout=0;
-      }
-    server {
-      listen             80 default_server;
-      # Add a trailing slash to top level requests
-      rewrite ^/([^/]+)$ /$1/ permanent;
-      location / {
-        set $upstream "http://app_server";
-        proxy_pass $upstream;
-      }
-    }
-  }
-And run nginx in a docker container with::
-  $ docker run \
-      --network swh \
-      -v ${PWD}/conf/nginx.conf:/etc/nginx/nginx.conf:ro \
-      -v /tmp/objstorage/objstorage.sock:/tmp/gunicorn.sock \
-      -p 5003:80 \
-      nginx
-Which you can check it is properly functionning::
-  $ http :5003/check_config check_write=True
-  HTTP/1.1 200 OK
-  Connection: keep-alive
-  Content-Length: 1
-  Content-Type: application/x-msgpack
-  Date: Thu, 20 Jun 2019 10:13:39 GMT
-  Server: nginx/1.17.0
-  true
-If you want your docker conotainers to start automatically, add the
-`--restart=always` option to docker commands above. This should prevent you
-from having to write custom service unit files.
--- a/docs/install-webapp.rst
+++ b/docs/install-webapp.rst
-.. highlight:: bash
-.. _webapp_install:
-Web Application
-===============
-The machine that hosts the front end web application must have access to all
-the Software Heritage service RPC endpoints. Some of them may be optional, in
-which case parts of the web UI won't work as expected.
-In this guide, we assume that:
- storage RPC server is running on machine `storage` listening on port 5002,
- objstorage RPC server is running on machine `objstorage` listening on
-  port 5003.
-For the metadata/indexer part to work, il will also require:
- indexer storage RPC server is running on machine `indexerstorage` listening
-  on port 5007.
-For the Vault part to work, it will also require:
- vault RPC server is running on machine `vault` listening on port 5005,
- scheduler RPC server is running on machine `scheduler` listening on
-  port 5008.
-For the Deposit part to work, it will also require:
- deposit RPC server is running on machine `vault` listening on port 5006,
- scheduler RPC server is running on machine `scheduler` listening on
-  port 5008.
-Docker based deployment
-----------------------
-In a docker based deployment, obviously, all machine names listed above must be
-resolvable from within a docker container and accessible from there.
-When testing this guide on a single docker host, the simplest solution is to
-start your docker containers linked to a common bridge::
-  $ docker network create swh
-  e0d85947d4f53f8b2f0393517f373ab4f5b06d02e1efa07114761f610b1f7afa
-  $
-In the examples below we will use such a network config.
-Build the image
-~~~~~~~~~~~~~~~
-::
-  $ docker build -t swh/web -f Dockerfile.web \
-     https://forge.softwareheritage.org/source/swh-docker.git
-  [...]
-  Successfully tagged swh/web:latest
-Configure the web app
-~~~~~~~~~~~~~~~~~~~~~
-Write a configuration file named `web.yml` like::
-  storage:
-    cls: remote
-    args:
-      url: http://storage:5002/
-      timeout: 1
-  objstorage:
-    cls: remote
-    args:
-      url: http://objstorage:5003/
-  indexer_storage:
-    cls: remote
-    args:
-      url: http://indexer-storage:5007/
-  scheduler:
-    cls: remote
-    args:
-      url: http://scheduler:5008/
-  vault:
-    cls: remote
-    args:
-      url: http://vault:5005/
-  deposit:
-    private_api_url: https://deposit:5006/1/private/
-    private_api_user: swhworker
-    private_api_password: ''
-  allowed_hosts:
-	- app_server
-  debug: no
-  serve_assets: yes
-  throttling:
-    cache_uri: null
-    scopes:
-      swh_api:
-        limiter_rate:
-          default: 120/h
-        exempted_networks:
-          - 0.0.0.0/0
-      swh_vault_cooking:
-        limiter_rate:
-          default: 120/h
-        exempted_networks:
-          - 0.0.0.0/0
-      swh_save_origin:
-        limiter_rate:
-          default: 120/h
-        exempted_networks:
-          - 0.0.0.0/0
-Testing the configuration
-~~~~~~~~~~~~~~~~~~~~~~~~~
-Then initialize the web app::
-  $ docker run --rm \
-      --network swh \
-      -v ${PWD}/web.yml:/etc/softwareheritage/config.yml \
-      swh/web migrate
-  Migrating db using swh.web.settings.production
-  Operations to perform:
-    Apply all migrations: admin, auth, contenttypes, sessions, swh.web.common
-  Running migrations:
-    No migrations to apply.
-  Creating admin user
-  $
-and start the web app::
-  $ docker run --rm  \
-      --network swh \
-      -v ${PWD}/web.yml:/etc/softwareheritage/config.yml \
-      -p 5004:5000 \
-      swh/web serve
-  starting the swh-web server
-  [...]
-You should be able to navigate the web application using your browser on
-http://localhost:5004 .
-If everything works fine, hit Ctrl+C in the terminal in which the docker
-container is running.
-Using memcache
-~~~~~~~~~~~~~~
-It is strongly advised to use a memcache for the web app. Considering such a
-service is listening on `memcache:11211`, you should adapt the
-`throttling.cache_uri` parameter of your `web.yml` file accordingly::
-  [...]
-  throttling:
-    cache_uri: memcache:11211
-  [...]
-You can easily start such a memcached server using::
-  $ docker run --name memcache --network swh -d memcached
-Running in production
-~~~~~~~~~~~~~~~~~~~~~
-This container uses gunicorn as WSGI server. However, since this later does not
-handle the HTTP stack well enough for a production system, it is recommended to
-run this behind a proper HTTP server like nginx via a unix socket.
-First, we start the webapp container without exposing the TCP port, but
-using a mounted file as socket to be able to share it with other containers.
-Here, we create this socket file in `/srv/softwareheritage/socks/web.sock`::
-  $ docker run -d --name webapp \
-      --network swh \
-      -v ${PWD}/web.yml:/etc/softwareheritage/config.yml \
-      -v /srv/softwareheritage/socks:/var/run/gunicorn/swh \
-      swh/web serve
-And start an HTTP server that will proxy the UNIX socket
-`/srv/softwareheritage/socks/web/sock`. Using Nginx, you can use the
-following `nginx.conf` file::
-  worker_processes  4;
-  # Show startup logs on stderr; switch to debug to print, well, debug logs when
-  # running nginx-debug
-  error_log /dev/stderr info;
-  events {
-    worker_connections 1024;
-  }
-  http {
-    include            mime.types;
-    default_type       application/octet-stream;
-    sendfile           on;
-    keepalive_timeout  65;
-    # Built-in Docker resolver. Needed to allow on-demand resolution of proxy
-    # upstreams.
-    resolver           127.0.0.11 valid=30s;
-    upstream app_server {
-      # fail_timeout=0 means we always retry an upstream even if it failed
-      # to return a good HTTP response
-      # for UNIX domain socket setups
-      server unix:/tmp/gunicorn/sock fail_timeout=0;
-      }
-    server {
-      listen             80 default_server;
-      location / {
-        set $upstream "http://app_server";
-        proxy_pass $upstream;
-      }
-    }
-  }
-Note that the `app_server` name in this file above must be listed in the
-`allowed_hosts` config option in the `web.yml` file.
-And run nginx in a docker container with::
-  $ docker run -d \
-    --network swh \
-    -v ${PWD}/conf/nginx.conf:/etc/nginx/nginx.conf:ro \
-    -v /srv/softwareheritage/socks/web:/tmp/gunicorn \
-    -p 5004:80 \
-    nginx
-Which you can check it is properly functionning navigating on http://localhost:5004
-If you want your docker conotainers to start automatically, add the
-`--restart=always` option to docker commands above. This should prevent you
-from having to write custom service unit files.
--- a/docs/installation.rst
+++ b/docs/installation.rst
-.. highlight:: bash
-.. _installation:
-How to Set Up a Software Heritage Archive
-=========================================
-This series of guides will help you install a (partial) Software Heritage
-Instance in a production system.
-The global :ref:`architecture of the Software Heritage Archive <architecture>`
-looks like:
-.. thumbnail:: images/general-architecture.svg
-   General view of the |swh| architecture.
-Each component can be installed alone, however there are some dependencies
-between those services.
-The base system used for these guides is a Debian system running the latest
-stable version.
-Components of the archicture can be installed on a sigle node (not recommended)
-or on set (cluster) of machines.
-.. _swh_debian_repo:
-Debian Repository
-~~~~~~~~~~~~~~~~~
-On each machine, the Software Heritage apt repository must be configured::
-  ~$ sudo apt install apt-transport-https lsb-release
-  ~$ echo deb [trusted=yes] https://debian.softwareheritage.org/ $(lsb_release -cs)-swh main | \
-        sudo sh -c 'cat > /etc/apt/sources.list.d/softwareheritage.list'
-  ~$ sudo apt update
-.. toctree::
-   install-objstorage