AFN mail gateway: redesign to increase resilience and reliability
Since I started working for SWH, I have had to ask for the gateway from AFN mail addresses to the AFN website to be fixed at the start of almost every new contract period. The symptoms have been undelivered mail and bounces, affecting both the mail I sent and mail sent by forge admins. After each fix I had to manually bounce the failed mails to their corresponding AFN mail to web gateway email addresses. This is a sub-optimal experience for SWH sysadmins, AFN moderators and forge admins.
So I would like the SWH sysadmin team to proactively and automatically discover, diagnose, fix and recover from issues with the AFN mail to web gateway.
I therefore request a more resilient and reliable design for the mail gateway. Something along these lines should work:
- Incoming mail would be delivered directly to on-disk per-AFNR INBOX Maildir folders
- A daemon would monitor the INBOX Maildir folders using inotify or fanotify
- On startup existing mails would be sent to the AFN mail API
- On inotify/fanotify events, new mails would be sent to the AFN mail API
- Sysadmins would be alerted to AFN mail API failure conditions
- Failed mails would be retried automatically at regular intervals
- Succeeded mails would move to long-term storage in separate per-AFNR Archive Maildir folders
- Presence of the daemon process would be monitored and alerted when missing
- The AFNR Archive Maildir folders would get archived to compressed tar files after a year of inactivity
- The storage containing the INBOX/Archive Maildir folders would be monitored for available space and alerted when almost full
- The number of mails in the INBOX Maildir folders would be monitored and alerted when more than two
- A special test AFNR address would be set up (e.g. AFNR#0)
- A periodic mail to the test AFNR address would be sent
- The test address AFNR status would get monitored and alerted when not "Waiting for feedback"
- The test address AFNR modification date would get monitored and alerted when too old
- The test address AFNR database records would get reverted periodically
- The test address AFNR Archive Maildir folder would get emptied periodically
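To make the delivery part of the list above more concrete, here is a minimal sketch of one pass over a single INBOX Maildir folder, using only Python's standard `mailbox` module. The names `process_inbox` and `send` are hypothetical. `send` stands in for whatever call submits a mail to the AFN mail API, whose interface is not described here, and is assumed to raise an exception on failure. The sketch does not cover inotify/fanotify or alerting. A real daemon would run this pass on startup, on each filesystem event, and on a retry timer.

```python
import mailbox
from typing import Callable, Tuple


def process_inbox(inbox_path: str, archive_path: str,
                  send: Callable[[bytes], None]) -> Tuple[int, int]:
    """Try once to deliver every mail in an INBOX Maildir folder.

    `send` is a hypothetical callable that submits the raw mail to the
    AFN mail API and raises an exception on failure.

    Returns (delivered, failed). Failed mails stay in the INBOX so that
    the next pass retries them.
    """
    inbox = mailbox.Maildir(inbox_path, create=True)
    archive = mailbox.Maildir(archive_path, create=True)
    delivered = failed = 0
    # keys() returns a list, so removing entries while looping is safe.
    for key in inbox.keys():
        raw = inbox.get_bytes(key)
        try:
            send(raw)
        except Exception:
            # Leave the mail in place; the caller can alert on failed > 0.
            failed += 1
            continue
        # Archive first, then remove, so a crash in between duplicates
        # the mail rather than losing it.
        archive.add(raw)
        inbox.remove(key)
        delivered += 1
    return delivered, failed
```

The returned counts could feed the monitoring items: a non-zero `failed` value, or more than two mails left in the INBOX, would be the conditions to alert on.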