As teams grow and systems become more complicated, what used to be a simple set of commands quickly becomes a chain of branching logic that requires insight and expertise. Runbooks offer a decent structure for communicating this to teams around the globe, but they may miss crucial information that only an expert would know. Runbooks are also only as good as the last time they were updated: because they sit outside typical day-to-day workflows, they are forgotten until they’re needed most.
This talk describes Slack’s approach to good runbook hygiene, our process for moving to automated tools, and how this has helped us scale our teams and infrastructure. We’ll dive into the tradeoffs of automation and show how to keep the process accessible to all members of the team, so that they gain familiarity and skills with the tooling. Attendees will learn not only how to improve the contents and format of their runbooks but also how to make code speak for operational procedures and ensure those procedures work best when needed most.