Title: Metrics, Gauges, Counters and Ratios; designing and implementing quality metrics


The only thing worse than no metrics is bad or misleading metrics. Well-designed metrics let you quickly know the state of your service and have confidence that your systems are healthy. Poor metrics distract you from finding the root causes of outages and extend downtime. Unfortunately, it isn’t always obvious what counts and how to count it.

This talk will cover the essential attributes of quality metrics and walk participants through the steps needed to capture them in a useful format while avoiding common pitfalls in metric design. It covers the principles of metric design, the types of metrics, and when to use each of the common types. Come learn about ratios, gauges, and counters; primary, secondary, proxy, and derived metrics; as well as intervals, ordinals, and more.


Caskey L. Dickson is a Site Reliability Engineer at Microsoft, where he is part of the leadership team reinventing operations at Azure. Before that he was at Google, where he worked as an SRE/SWE writing and maintaining monitoring services that operate at “Google scale,” as well as a few business intelligence pipelines and maybe a script or two. He has worked in online services since 1995, when he turned up his first web server, and has been online ever since. Before working at Google, he was a senior developer at Symantec, wrote software for various Internet startups such as CitySearch and CarsDirect, ran a consulting company, and even taught undergraduate and graduate computer science at Loyola Marymount University. He has a B.S. in Computer Science, a Master's in Systems Engineering, and an M.B.A. from Loyola Marymount.