Applying Data Compression to the SQL Sentry Database : Part 5 (Follow-up)
Recently, I completed a four-part series on Applying Data Compression to the SQL Sentry Database (Part 1 | Part 2 | Part 3 | Part 4). You might be asking yourself why I am posting Part 5 if I recently completed the series and the answer is that I wanted to follow up on some comments from the previous parts. The comments provided some questions and helpful suggestions from Kevin Kline (b | t) and Glenn Berry (b | t). Kevin wondered about wait stats that might be specific to data compression and Glenn provided a tip on setting MAXDOP while applying compression to prevent the process from consuming all of the available CPU.
Data Compression and Wait Statistics
For Kevin's question on wait statistics, I utilized SQL Sentry Performance Advisor to collect and view the SQL Server Waits. I reused the test databases and environment from my data compression series as described in the previous posts. The first thing I did while looking for wait statistics related to data compression was rebuild all of the indexes in my test databases, where each database had a method of compression previously implemented (none, row, or page). I could see the category of waits in real-time on the SQL Sentry Performance Advisor Dashboard. Screens from those trials are shown below.
Figure 1 showing SQL Server Waits in the Performance Advisor Dashboard for the database with no compression
Figure 2 showing SQL Server Waits in the Performance Advisor Dashboard for the database with row compression
Figure 3 showing SQL Server Waits in the Performance Advisor Dashboard for the database with page compression
The orange shaded areas in the figures above represent the Disk category of waits. As one might expect, the reduced I/O and disk-related activity in the compressed databases leads to fewer disks waits when using those databases. Visually, it is easy to see the significant reduction in orange from Figure 1 to Figure 2, and the even more dramatic reduction when comparing Figure 1 to Figure 3. From the Performance Advisor Dashboard it is simple to navigate to the details of SQL Server Waits by category, class, and type by right clicking on the screen and selecting Quick Report. This will allow you to view specific percentages and actual wait times in milliseconds (ms).
Figure 4 showing how to generate the SQL Server Wait Stats Analysis Report from Performance Advisor
Figure 5 showing the SQL Server Wait Stats Analysis Report
The SQL Server Wait Stats Analysis can be exported to Excel, PDF, or Word, making it easy to compare the wait statistics from different trials and baselines. If you like to use wait statistics as part of your performance troubleshooting and tuning (and who doesn't?) then I'm sure you can see how this could simplify the data collection and presentation for you.
In a subsequent test, I examined how wait statistics changed over a period of two hours while the SQL Sentry database was using the different methods of compression during its usual monitoring. As shown below, there was a reduction in disk waits, which resulted in fewer waits overall.
Figure 6 showing SQL Server Waits by Category for the database with no compression
Figure 7 showing SQL Server Waits by Category for the database with row compression
Figure 8 showing SQL Server Waits by Category for the database with page compression
During my testing, I did not come across any wait statistics that were specific to data compression. Kevin also dug into this more and responses from those in the SQL Server Community indicated that a wait statistic specific to data compression overhead does not exist. There were some helpful suggestions regarding specific Extended Events for Columnstore indexes, which I plan to explore in the future.
Data Compression and Max Degree of Parallelism
The next thing I wanted to demonstrate in this post was based on Glenn Berry's comment on Part 2 of my series where he said, "One thing you can (and probably should) do is to set MAXDOP = 1 when you rebuild your indexes to apply ROW or PAGE compression, especially if you are doing it with ONLINE = ON. This will usually make it take longer, (unless sequential I/O capacity is your bottleneck), but it will reduce the CPU impact to your system, since only one processor core will be rebuilding the index." I thought this was great advice and wanted to see the impact on my test environment.
Figure 9 showing CPU utilization for each method of compression with unrestricted and restricted MAXDOP settings
Figure 9, above, shows a comparison of the CPU utilization when the Maximum Degree of Parallelism was unrestricted (MAXDOP=0) and when it was restricted to a single CPU core (MAXDOP=1). When MAXDOP was unrestricted, the average CPU utilization ranged from 30% to 73%, and while applying row or page compression, it spiked to just under 100%. When MAXDOP is restricted to a single CPU, the average CPU utilization is below 13% for all three methods, which makes sense because this server has 8 cores (100/8 = 12.5 and it's using that core near or at capacity to apply the compression). Total CPU utilization might average or spike higher than that, of course, as other things are happening on the system, but on a quiet test server you will likely be able to see something similar in the average when restricting MAXDOP. Performance Advisor also provides a clear view of this while monitoring the system. Figure 10, below, shows an example of applying compression with MAXDOP = 1 in my test environment.
Figure 10 showing the process limited to a single core while applying page compression
As Glenn had mentioned, restricting MAXDOP does come with the strong likelihood of increasing the amount of time required for the process to complete. You can see the difference in my lab environment below. It is rather significant for applying page compression in this example.
Figure 11 showing total duration to apply each method of compression with unrestricted and restricted MAXDOP settings
I hope this post has provided you with some helpful additional guidance for applying data compression to your SQL Sentry database or other SQL Server databases, as well as ideas on how you can use SQL Sentry Performance Advisor for your own testing and analysis. Thank you for reading, and thank you to Kevin and Glenn for providing the feedback, which prompted me to write this follow-up.