Monitoring alerts from APM (application performance monitoring) and other tools are a notorious source of alert fatigue among on-call operations team members. A significant fraction of these alerts are false positives, which waste engineering and troubleshooting time. It is also common for different APM products to monitor different parts of the tech stack, and the scope for customizing their output is often limited.
During escalation, an alert often passes through multiple people before it is finally resolved, so there is room to save engineering time here as well.
You will learn how to build a status communication system (pipeline). You will gain insight into effective status communication strategies, techniques for minimizing false-positive alerts, and efficient alert management. Finally, you will learn how to use version control to switch between different status communication and alerting configurations, potentially enabling blue-green deployments and/or A/B testing of the dashboard.