In the first two posts of this series (Part 1 | Part 2), I discussed the space savings that can be achieved with data compression, as well as the overhead cost – specifically, additional CPU overhead. In this post, I am going to give examples of additional benefits attained with data compression and demonstrate the decision-making process used when applying data compression in production.
Decreased memory consumption can be a benefit of data compression. If you recall from my second post, I executed some tests that had SQL Sentry monitoring one server and five servers with the different methods of compression. The chart below shows less memory was required for the same workload when compression was implemented.
Figure 3-A showing average total memory consumption with and without compression
The pages that are compressed on disk are also compressed in memory, which means that queries read fewer pages in memory when reading from compressed tables. This can increase performance by allowing more data into the buffer cache. In looking at some specific queries, the initial execution time (using cold cache) was markedly improved with each level of compression. The numbers below show an example from a query using PerformanceAnalysisData:
Figure 3-B showing query metrics with and without compression
Since pages are compressed on disk, this also lowers disk I/O, which can be particularly helpful on a system where disk is often a bottleneck. There was a substantial improvement in disk latency in my test environment for reads.
Figure 3-C showing average disk latency for reads with and without compression
Now that I have shown you some of the risks and benefits of compressing data, let’s move on to how to decide which indexes should be compressed.
This technical article from Microsoft titled, "Data Compression: Strategy, Capacity Planning, and Best Practices," which I mentioned in my first post, has an excellent example of a decision-making table as well as the T-SQL to find "U: Percent of Update Operations on the Object" and "S: Percent of Scan Operations on the Object." If something has a high U value, then CPU overhead increases with each level of compression. If something has a high S value, then you should benefit from using compression in queries. I used this process and added a column for disk space because some of the indexes are tiny and not necessarily worth compressing. In the world of SQL Server, it always depends, right? Keep in mind that the S and U values returned in the queries are from the last time SQL Server was started or those stats were last cleared.
The disk space values were based on the production database I was using:
Disk Space Column
Tiny <= 512MB
Small >512MB && <= 1GB
Medium >1GB && <= 5GB
Large >5GB && <= 10GB
Huge >10GB && <= 15GB
Below is IX_PerformanceAnalysisData_Wide on PerformanceAnalysisData. This is the only index on this table, which is one of the largest in the SQL Sentry database. It has scan activity, but no updates. As a "Gigantic" index that compresses significantly with page, I recommend using page compression. I noted, "ROW if CPU concerns," and should clarify that those are concerns about CPU during applying compression or index maintenance, not during normal activity. Since the updates are low to none, there is not much to cause CPU concerns otherwise, and as previously shown, the average impact is typically less than 1%.
Figure 3-D showing decision table for PerformanceAnalysisData
The PerformanceAnalysisPlan table that I showed in my first post does not contain any indexes that I would recommend compressing:
Figure 3-E showing decision table for PerformanceAnalysisPlan
The PerformanceAnalysisTraceData table is an example of mixing compression types across indexes. The "% Savings" on PK_PerformanceAnalysisTraceData is similar for row and page and has high U compared to S. It does not compress very well, but is gigantic in size, which is why I recommend row. IX_MaxIDs is a tiny index, but has a decent amount of scans, so it may be helpful to performance to compress that data. IX_PerformanceAnalysisTraceData_Wide is well suited for page compression for the "% Savings" and minimal activity.
Figure 3-F showing decision table for PerformanceAnalysisTraceData
In my next post, I will provide the scripts for applying the recommended compression and some tips on how to execute them in your database.