Verified commit 7147b9c0 authored by Vincent Sellier

document the firewalls upgrade procedure

initiate an infrastructure section and a network sub-section

Remark: the png of the plantuml diagram is committed because
the svg is not correctly rendered

Related to T3203
parent 3d546edd
docs/images/infrastructure/network/carp_maintenance.png

11 KiB

docs/images/infrastructure/network/check_for_upgrade.png

36.8 KiB

docs/images/infrastructure/network/proceed_update.png

12.4 KiB

docs/images/infrastructure/network/reactivate_carp.png

11 KiB

docs/images/infrastructure/network/sync.png

20.8 KiB

docs/images/network.png

25.5 KiB

[SVG image: PlantUML 1.2018.13 failed to render network.uml and reported "Syntax Error?" (line 19), at the description line of the group block. The source embedded in the SVG duplicates the nwdiag listing that follows.]
@startuml
nwdiag {
inet [ shape = cloud ];
inet -- inria_gw;
network VLAN210 {
louvre [address = "VPN" ];
inria_gw [description = "INRIA GW"];
}
network VLAN1300 {
workers;
kafka;
inria_gw;
forge;
pergamon;
group {
description = "<b>FIREWALLS</b>";
pushkin;
glyptotek;
}
}
network VLAN440 {
workers;
pushkin;
glyptotek;
louvre;
forge;
kafka;
pergamon;
production_nodes [description = "Production nodes"];
}
network VLAN443 {
pushkin;
glyptotek;
staging_nodes [description = "Staging nodes"];
}
network VLAN442 {
pushkin;
glyptotek;
admin_nodes [description = "Admin nodes"];
}
}
@enduml
......@@ -52,6 +52,11 @@ Roadmap
* :ref:`roadmap-2021`
Engineering
-----------
* :ref:`infrastructure`
Components
----------
......@@ -196,6 +201,7 @@ Indices and tables
contributing/index
tutorials/index
roadmap/roadmap-2021.rst
infrastructure/index
swh.auth <swh-auth/index>
swh.core <swh-core/index>
swh.counters <swh-counters/index>
......
.. _infrastructure:

Infrastructure
##############

.. keep this in sync with the 'sysadm' section in swh-docs/docs/index.rst

This section gathers the knowledge base and procedures related to the management of the |swh| infrastructure.

.. toctree::
   :maxdepth: 2
   :titlesonly:

   network

Network documentation
#####################

.. keep this in sync with the 'sysadm' section in swh-docs/docs/index.rst

This section gathers the knowledge base for our network components.

.. toctree::
   :maxdepth: 2
   :titlesonly:

Network architecture
********************

The network is split into several VLANs provided by the INRIA network team:

.. thumbnail:: ../images/network.png

Firewalls
=========

The firewalls are two `OPNsense <https://opnsense.org>`_ VMs deployed on the Proxmox cluster in a `High Availability <https://docs.opnsense.org/manual/hacarp.html?highlight=high%20availability>`_ configuration.

They share a virtual IP on each VLAN, which acts as the gateway. Only one of the two firewalls owns all the gateway IPs at any given time. This owner is called the ``PRIMARY``.

.. list-table::
   :header-rows: 1

   * - Nominal role
     - Name (link to the inventory)
     - Login page
   * - PRIMARY
     - `pushkin <https://inventory.internal.softwareheritage.org/virtualization/virtual-machines/75/>`_
     - `https://pushkin.internal.softwareheritage.org <https://pushkin.internal.softwareheritage.org>`_
   * - BACKUP
     - `glyptotek <https://inventory.internal.softwareheritage.org/virtualization/virtual-machines/86/>`_
     - `https://glyptotek.internal.softwareheritage.org <https://glyptotek.internal.softwareheritage.org>`_
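
The ownership rule described above (one firewall holds every gateway IP, the other holds none) can be expressed as a small check. This is only an illustrative sketch: the function name ``firewall_role`` and the input format (a mapping from virtual IP to the CARP status read on the status page) are assumptions, not something provided by OPNsense.

```python
from typing import Mapping


def firewall_role(vip_statuses: Mapping[str, str]) -> str:
    """Derive a firewall's role from the CARP status of each virtual IP.

    ``vip_statuses`` is a hypothetical mapping {virtual IP: status}, filled in
    by hand from the CARP status page ("MASTER" or "BACKUP" per IP).
    """
    statuses = {status.upper() for status in vip_statuses.values()}
    if statuses == {"MASTER"}:
        return "PRIMARY"  # this firewall owns every gateway IP
    if statuses == {"BACKUP"}:
        return "BACKUP"  # this firewall owns none of them
    # Mixed or empty statuses: the pair is not in a nominal state.
    return "INCONSISTENT"


# Made-up addresses from the documentation ranges, not the real VIPs.
print(firewall_role({"192.0.2.1": "MASTER", "198.51.100.1": "MASTER"}))
```

A mixed result (some IPs ``MASTER``, some ``BACKUP``) is reported separately because it means neither firewall matches the nominal roles listed in the table.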

Configuration backup
--------------------

The configuration is automatically committed to a `git repository <https://forge.softwareheritage.org/source/iFWCFG/branches/master/>`_.
Each firewall regularly pushes its configuration to a dedicated branch of the repository.

The backup settings are visible on the `System / Configuration / Backups <https://pushkin.internal.softwareheritage.org/diag_backup.php>`_ page
of each firewall.

Upgrade procedure
-----------------

Initial status
^^^^^^^^^^^^^^

This is the nominal status of the firewalls:

.. list-table::
   :header-rows: 1

   * - Firewall
     - Status
   * - pushkin
     - PRIMARY
   * - glyptotek
     - BACKUP

Preparation
^^^^^^^^^^^

* Connect to the `primary <https://pushkin.internal.softwareheritage.org>`_ (pushkin here)
* Check the `CARP status <https://pushkin.internal.softwareheritage.org/carp_status.php>`__ to ensure the firewall is the primary (all the IPs must have the status ``MASTER``)
* Connect to the `backup <https://glyptotek.internal.softwareheritage.org>`_ (glyptotek here)
* Check the `CARP status <https://glyptotek.internal.softwareheritage.org/carp_status.php>`__ to ensure the firewall is the backup (all the IPs must have the status ``BACKUP``)
* Ensure the two firewalls are in sync:

  * On the primary, go to the `High availability status <https://pushkin.internal.softwareheritage.org/status_habackup.php>`_ page and force a synchronization
  * Click on the button to the right of ``Synchronize config to backup``

  .. image:: ../images/infrastructure/network/sync.png

* Switch the primary/backup roles to prepare the upgrade of the current primary
  (the switch is transparent to users and can be done without service interruption):

  * [1] On the primary, go to the `Virtual IPs status <https://pushkin.internal.softwareheritage.org/carp_status.php>`_ page
  * Activate the CARP maintenance mode

  .. image:: ../images/infrastructure/network/carp_maintenance.png

  * Check the status of the VIPs: they must be ``BACKUP`` on pushkin and ``MASTER`` on glyptotek

* Wait a few minutes to let the monitoring detect any connection issues, and check the ssh connection to several servers on different VLANs (staging, admin, ...)

If everything is ok, proceed to the next section.
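
The ssh check mentioned in the last step can be partly automated. The following is a minimal sketch, assuming that a successful TCP connection to port 22 is a good enough signal; the helper names and the host names in the usage note are hypothetical placeholders, not real inventory entries.

```python
import socket
from typing import Dict, List


def ssh_port_reachable(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_hosts(hosts_by_vlan: Dict[str, List[str]]) -> Dict[str, bool]:
    """Check every listed host and return {host: reachable}."""
    return {
        host: ssh_port_reachable(host)
        for hosts in hosts_by_vlan.values()
        for host in hosts
    }
```

For instance, ``check_hosts({"staging": ["staging-host.example"], "admin": ["admin-host.example"]})`` returns one boolean per host. This only confirms that packets are routed through the new primary; it does not replace an actual ssh login or the monitoring.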

Upgrade the first firewall
^^^^^^^^^^^^^^^^^^^^^^^^^^

Before starting this section, the firewall statuses should be:

.. list-table::
   :header-rows: 1

   * - Firewall
     - Status
   * - pushkin
     - BACKUP
   * - glyptotek
     - PRIMARY

If not, be sure of what you are doing and adapt the links accordingly.

* [2] Go to the `System Firmware: status <https://pushkin.internal.softwareheritage.org/ui/core/firmware#status>`_ page (pushkin here)
* Click on the ``Check for upgrades`` button

  .. image:: ../images/infrastructure/network/check_for_upgrade.png

* Follow the instructions given by the interface; one or several reboots may be necessary depending on the number of upgrades to apply

  .. image:: ../images/infrastructure/network/proceed_update.png

* Repeat from the ``Check for upgrades`` step until there are no more upgrades to apply
* Switch the primary/backup roles to restore ``pushkin`` as the primary:

  * On the current backup (pushkin here), go to `Virtual IPs status <https://pushkin.internal.softwareheritage.org/carp_status.php>`_
  * [3] Click on ``Leave Persistent CARP Maintenance Mode``

  .. image:: ../images/infrastructure/network/reactivate_carp.png

  * Refresh the page: the role should have changed from ``BACKUP`` to ``MASTER``
  * On the other firewall, check that the role is indeed ``BACKUP`` for all the IPs

* Wait a few moments to ensure everything is ok with the new version

Upgrade the second firewall
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Before starting this section, the firewall statuses should be:

.. list-table::
   :header-rows: 1

   * - Firewall
     - Status
   * - pushkin
     - PRIMARY
   * - glyptotek
     - BACKUP

If not, be sure of what you are doing and adapt the links accordingly.

* Proceed to the second firewall upgrade:

  * Perform [1] on the backup (should be ``glyptotek`` here)
  * Perform [2] on the backup (should be ``glyptotek`` here)
  * Perform [3] on the backup (should be ``glyptotek`` here)
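
The whole procedure follows one invariant: a firewall is only upgraded while the other one holds the virtual IPs. The sketch below replays steps [1], [2] and [3] on each firewall under a deliberately simplified model of CARP (an assumption made for illustration): the nominal primary holds the IPs unless it is in maintenance mode. The function names are hypothetical.

```python
PREFERRED_PRIMARY = "pushkin"  # nominal PRIMARY according to the tables above
FIREWALLS = ("pushkin", "glyptotek")


def current_primary(in_maintenance: set) -> str:
    """Return the firewall expected to hold the virtual IPs.

    Simplified model (an assumption): the preferred primary holds the IPs
    unless it is in CARP maintenance mode, in which case the other one does.
    """
    if PREFERRED_PRIMARY not in in_maintenance:
        return PREFERRED_PRIMARY
    return next(fw for fw in FIREWALLS if fw != PREFERRED_PRIMARY)


def rolling_upgrade() -> list:
    """Replay steps [1], [2], [3] on each firewall and log who is primary."""
    in_maintenance = set()
    log = []
    for firewall in FIREWALLS:
        in_maintenance.add(firewall)  # [1] enter CARP maintenance mode
        primary = current_primary(in_maintenance)
        assert primary != firewall, "never upgrade the active primary"
        log.append((f"upgrade {firewall}", primary))  # [2] apply the upgrades
        in_maintenance.discard(firewall)  # [3] leave maintenance mode
        log.append((f"{firewall} done", current_primary(in_maintenance)))
    return log


for step, primary in rolling_upgrade():
    print(f"{step:<18} primary={primary}")
```

The assertion inside the loop is the point of the sketch: if it ever failed, the upgrade would be about to reboot the firewall that is currently routing traffic.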