I've never been in the position where I only had one or two SQL Servers to manage, so when I monitor performance, I like to be efficient. The better word is probably lazy, but efficient sounds better. There are always ways to alert when a server is in trouble, and for those you can typically use hard numbers, for example: CPU – % Processor Time > 90%, or Windows available memory < 100 MB. What I also want to alert on is when the behavior on my servers isn't necessarily in the danger zone, but is very different from normal. Since it seems like no two servers behave the same way, that type of alerting takes a lot of research. Unless, of course, you're using SQL Sentry Performance Advisor.
With the introduction of v8 (not the drink :) ), Performance Advisor added the ability to create custom baselines, along with custom conditions based on those baselines. These features let me baseline normal activity on each server and alert if, say, my CPU utilization was 40% above my baseline, or my page life expectancy (PLE) was 20% below it, and that was great. It still took some manual research, though, because the intention was to alert when server performance varied from the norm. Some servers have more variance in 'normal' activity than others, so I still needed to figure out what that threshold was for each server.
With the recent release of v8.4, we can now alert based on standard deviations and that, my friends, is awesome. Unfamiliar with standard deviations? Well, it's basically this:

σ = √( Σ(xᵢ − μ)² / N )
That didn't help you? It wouldn't help me either. My son in high school could figure that out, but not me.
I do, however, understand the concept of standard deviations. Many sets of data fall into a normal distribution (or bell curve, or Gaussian distribution). You may remember bell curves from school – I used them for grading test scores. How does standard deviation play into this? Standard deviation, in its simplest sense, measures how spread out the data points are from the mean (or average). Using the 68-95-99.7 rule (nerd alert!) we know that, in a normal distribution, roughly 68% of the values fall within one standard deviation of the mean, 95% fall within two standard deviations, and 99.7% fall within three. (And if this is fascinating to you, I verified my information here.)
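If you'd rather see the rule than take it on faith, here's a quick Python sketch that checks it empirically. The metric, mean, and spread are made-up illustrative numbers, not anything from a real server:

```python
import random
import statistics

# Simulate a "normal" performance metric: CPU % samples drawn from a
# Gaussian with mean 40 and standard deviation 8 (illustrative numbers).
random.seed(42)
samples = [random.gauss(40, 8) for _ in range(100_000)]

mean = statistics.fmean(samples)
stdev = statistics.stdev(samples)

# Count how many samples fall within 1, 2, and 3 standard deviations.
for k in (1, 2, 3):
    within = sum(1 for x in samples if abs(x - mean) <= k * stdev)
    print(f"within {k} stdev: {within / len(samples):.1%}")
```

Run it and the three percentages come out very close to 68%, 95%, and 99.7% – the rule in action.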
So how can we make use of standard deviations in alerting within SQL Sentry Performance Advisor? The first step is to create a custom baseline representing a period of normal activity on the server. This provides you with the mean (or average) for each metric. We know that any values falling two or three standard deviations from that mean are well outside the expected range; they represent server behavior that is different from "normal." We can use custom conditions to capture those abnormal values and raise alerts when they occur. In the example below, we are alerting on the Processor Information: % Processor Time counter when it is 2 standard deviations above the mean:
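Performance Advisor builds this condition for you in its UI, but to make the logic concrete, here's a minimal Python sketch of the same idea. The function names and the baseline sample values are hypothetical, purely for illustration:

```python
import statistics

def build_baseline(history):
    """Summarize a window of 'normal' metric samples into a baseline:
    the mean and (sample) standard deviation of the observed values."""
    return statistics.fmean(history), statistics.stdev(history)

def is_abnormal(value, mean, stdev, k=2):
    """Flag a reading more than k standard deviations above the mean,
    mirroring the '% Processor Time > mean + 2 stdev' condition."""
    return value > mean + k * stdev

# Baseline from a period of typical CPU utilization (made-up samples).
baseline_mean, baseline_stdev = build_baseline(
    [35, 42, 38, 40, 37, 44, 39, 41, 36, 43]
)

print(is_abnormal(45, baseline_mean, baseline_stdev))  # False: within 2 stdev
print(is_abnormal(60, baseline_mean, baseline_stdev))  # True: well above it
```

The point of `k=2` is exactly the 68-95-99.7 rule: if the metric really is roughly normal, a reading more than two standard deviations above the baseline mean should happen only a few percent of the time, so it's worth waking up for.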
Now create custom baselines with the same name on all of your servers, covering a similar time frame, and apply that same custom condition everywhere. It will alert on the same outliers regardless of the level of activity on each server.
That is efficiency at its best. Have fun with it!