Applying Data Compression to the SQL Sentry Database : Part 2
In my first post in this series, I described some of the space savings that can be achieved with compression. In this post, I am going to focus on some of the costs.
One of the first costs to consider is the application of the compression itself. Compression is applied through an alter index rebuild statement (could also be a table when compressing a heap, but everything we will be working with in this series is the clustered index or a non-clustered index).
Altering and rebuilding those fifty indexes from the test I described in Part I with the compression set to "none" took seven minutes and thirty seconds, "row" took seven minutes and fifty-five seconds, and then "page" jumped up to seventeen minutes and twelve seconds. The additional page compression algorithms demand more resources from the system.
Figure 2-A showing time to apply compression the SQL Sentry test database
What happened to CPU during those times? Thanks to the "Create Baseline" feature in SQL Sentry’s Performance Advisor Dashboard, I was quickly able to collect some metrics. The average Total CPU Usage % nearly doubled when applying row compression and more than doubled when applying page compression. The maximum values nearly doubled for row and page, both hitting almost 100%. Clearly, you want to make sure that you apply compression at a time when your system has some extra CPU available.
Figure 2-B showing average and maximum CPU values while applying compression the SQL Sentry test database
For the record, the machine used in this test (and subsequent trials in this test environment) had Intel Core i7-2600S CPU @ 2.80GHz with 1 socket and four cores (8 logical).
Another cost to consider when applying compression is the impact that it will have to your maintenance routine, specifically if that scheduled maintenance includes rebuilding indexes. I used SQL Sentry’s Fragmentation Manager to rebuild all indexes in my test database for some comparisons. In my first trial, I executed the defragment against all indexes in the database, with the rebuild set to offline.
Figure 2-C showing SQL Sentry's Fragmentation Manager menu
The duration differences seen when rebuilding the indexes are similar to the differences seen when applying compression. Rebuilding indexes took about the same time when compression was set to none and row, but nearly doubled when page compression was used in the database.
Figure 2-D showing the time to rebuild all indexes offline in SQL Sentry Fragmentation Manager
What about CPU usage during the rebuild? Looking at the chart below, you can see that average was two and a half times greater when using the database with row compression, and it more than tripled when page compression was in use.
Figure 2-E showing total CPU utilization while rebuilding all indexes offline
The following graph – from the Performance Advisor dashboard – shows CPU usage during the rebuild. The baseline without compression (in magenta) against page compression (in light blue):
CPU usage during rebuilds: comparing page compression to no compression
In my second trial, I executed the defragment against all indexes in the database, with the rebuild set to online. Here, row compression has more of an impact than it did offline, and page compression shows its more significant impact.
Figure 2-F showing the time to rebuild all indexes online in SQL Sentry Fragmentation Manager
Rebuilding the indexes online is more CPU intensive than offline in general and there is less of a difference between none, row, and page compared to the offline rebuilds.
Figure 2-G showing total CPU utilization while rebuilding all indexes online
In another round of testing, I observed the overhead of using compression while the SQL Sentry software was doing its job of monitoring with Performance Advisor and Event Manager. I did this while SQL Sentry was monitoring one server in one set of tests and while monitoring five servers in another set.
Figure 2-H showing average total CPU utilization while using SQL Sentry Performance Advisor and Event Manager with and without compression
The average CPU impact was under 1% during these tests. Maximum CPU may experience higher spikes. I feel like it is important to mention at this point that many of the tables involved experience far more inserts and deletes than updates, which minimizes the CPU impact of data compression. That is something I will discuss in more detail in my next post.
In addition to playing in my test environment for this project, I also applied compression to the largest tables on disk in one of our busy production environments. I did this more sensibly, because I wanted it to be performing the way I will be recommending it to you. In addition, I did not want to risk having my access to the production server (or building) revoked! You can quickly get a list of the largest indexes by disk usage in your SQL Server database by going into Performance Advisor –> Indexes –> Grid. You could also use the Tree tab to group them by table. Depending on what a table or index is storing, the largest one isn’t necessarily the one with the most rows (click to enlarge).
In production, I applied page level compression to IX_PerformanceAnalysisData_Wide on PerformanceAnalysisData. Applying page compression to this single index took nine minutes, spiked the CPU to 95%+ for about five minutes, and increased disk waits and spikes to disk I/O. Applying page level compression to this one index saved 10GB of space, taking the index from 16GB down to 6GB. The next morning I looked at the stats to confirm that I had not destroyed the production environment and, that night, proceeded to apply more. I page compressed IX_PerformanceAnalysisData_Wide on PerformanceAnalysisDataDiskCounter. This took over seventeen minutes to complete, spiked CPU to 97%+ for twelve minutes, and increased disk waits and spikes to disk I/O. The index was reduced from nearly 23GB to 9GB with page compression.
I checked for total destruction the following morning and, upon finding the system alive and well, proceeded with my compression madness. I took a set of eighteen-hour long baselines from before applying any compression and another set that included the compressed indexes for the same time. For example, I had the CPU usage from Tuesday, Wednesday, and Thursday before any compression had been applied to the database and then the following Tuesday, Wednesday, and Thursday, which included my compression.
Figure 2-J showing CPU usage in production before and after applying data compression to some of the largest indexes
The average CPU in production was minimally impacted as my earlier testing has indicated it would. Isn’t it fantastic when something turns out just the way it did in test?
In my next post I will illustrate some of the additional benefits of data compression (beyond space savings) and walk through the decision-making process for deciding which indexes to compress and which method to use when compressing. One of the great things about data compression is the flexibility in how you can apply it. A mix of none, row, and page can be used across indexes on the same table. That flexibility is something we will take advantage of when applying compression in the SQL Sentry database.