Data Compression in the SQL Sentry Database, Part 3

In the first two posts of this series (Part 1 | Part 2), I discussed the space savings that data compression can achieve, as well as its overhead cost, specifically the additional CPU overhead. In this post, I am going to give examples of additional benefits of data compression and demonstrate the decision-making process I used when applying it in production.

Decreased memory consumption can be a benefit of data compression. Recall from my second post that I ran tests in which SQL Sentry monitored one server and then five servers under each compression method. The chart below shows that less memory was required for the same workload when compression was enabled.

Figure 3-A showing average total memory consumption with and without compression

Pages that are compressed on disk remain compressed in memory, which means queries read fewer pages when reading from compressed tables. This can improve performance by allowing more data to fit in the buffer cache. Looking at some specific queries, the initial execution time (from a cold cache) improved markedly with each level of compression. The numbers below show an example from a query using PerformanceAnalysisData:
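If you want to see this effect on your own server, one way is to count how many buffer pool pages each index currently occupies. This is a sketch, not part of the original testing; it assumes you run it in the monitored database and have VIEW SERVER STATE permission:

```sql
-- How many buffer pool pages each index holds in the current database.
-- Compressed pages stay compressed in the buffer pool, so compressed
-- indexes should show fewer cached pages for the same data.
SELECT o.name AS object_name,
       i.name AS index_name,
       COUNT(*) AS cached_pages,
       COUNT(*) * 8 / 1024 AS cached_mb
FROM sys.dm_os_buffer_descriptors AS bd
JOIN sys.allocation_units AS au
    ON bd.allocation_unit_id = au.allocation_unit_id
JOIN sys.partitions AS p
    ON au.container_id = p.hobt_id
   AND au.type IN (1, 3)          -- IN_ROW_DATA, ROW_OVERFLOW_DATA
JOIN sys.indexes AS i
    ON p.object_id = i.object_id AND p.index_id = i.index_id
JOIN sys.objects AS o
    ON p.object_id = o.object_id
WHERE bd.database_id = DB_ID()
GROUP BY o.name, i.name
ORDER BY cached_pages DESC;
```

Running this before and after compressing a table, with a comparable workload in between, makes the reduced memory footprint visible directly.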

Figure 3-B showing query metrics with and without compression

Because pages are compressed on disk, compression also lowers disk I/O, which can be particularly helpful on a system where disk is often the bottleneck. In my test environment, there was a substantial improvement in disk latency for reads.

Figure 3-C showing average disk latency for reads with and without compression

Now that I have shown you some of the risks and benefits of compressing data, let’s move on to how to decide which indexes should be compressed.

The Microsoft technical article "Data Compression: Strategy, Capacity Planning, and Best Practices," which I mentioned in my first post, has an excellent example of a decision-making table, as well as the T-SQL to find "U: Percent of Update Operations on the Object" and "S: Percent of Scan Operations on the Object." If an object has a high U value, CPU overhead increases with each level of compression; if it has a high S value, queries against it should benefit from compression. I followed this process and added a column for disk space, because some of the indexes are tiny and not necessarily worth compressing. In the world of SQL Server, it always depends, right? Keep in mind that the S and U values returned by the queries accumulate from the last time SQL Server was started or the stats were last cleared.
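The whitepaper's queries are built on sys.dm_db_index_operational_stats. A condensed sketch along the same lines (the column arithmetic follows the whitepaper's U and S definitions; adapt it to your own database before relying on it) looks like this:

```sql
-- Approximate U (% update operations) and S (% scan operations) per index.
-- Counters reset at SQL Server restart, so gather these after a
-- representative period of activity.
SELECT o.name AS table_name,
       i.name AS index_name,
       ios.leaf_update_count * 100.0 /
           NULLIF(ios.range_scan_count + ios.leaf_insert_count
                + ios.leaf_delete_count + ios.leaf_update_count
                + ios.leaf_page_merge_count + ios.singleton_lookup_count, 0)
           AS [U %],
       ios.range_scan_count * 100.0 /
           NULLIF(ios.range_scan_count + ios.leaf_insert_count
                + ios.leaf_delete_count + ios.leaf_update_count
                + ios.leaf_page_merge_count + ios.singleton_lookup_count, 0)
           AS [S %]
FROM sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) AS ios
JOIN sys.objects AS o
    ON ios.object_id = o.object_id
JOIN sys.indexes AS i
    ON ios.object_id = i.object_id AND ios.index_id = i.index_id
WHERE o.type = 'U'   -- user tables only
ORDER BY table_name, index_name;
```

Indexes with a high S % and low U % are the most attractive candidates for page compression; high U % pushes you toward row compression or none at all.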

The disk space values were based on the production database I was using:

Disk Space Column
Tiny: <= 512 MB
Small: > 512 MB and <= 1 GB
Medium: > 1 GB and <= 5 GB
Large: > 5 GB and <= 10 GB
Huge: > 10 GB and <= 15 GB
Gigantic: > 15 GB
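To populate a disk space column like this, you can bucket each index by its on-disk size with a query along these lines (a sketch using the thresholds above; the labels are the ones from my decision table):

```sql
-- Assign each index a size bucket based on its used pages on disk.
SELECT o.name AS table_name,
       i.name AS index_name,
       ps.used_page_count * 8 / 1024 AS size_mb,
       CASE
           WHEN ps.used_page_count * 8 / 1024 <= 512   THEN 'Tiny'
           WHEN ps.used_page_count * 8 / 1024 <= 1024  THEN 'Small'
           WHEN ps.used_page_count * 8 / 1024 <= 5120  THEN 'Medium'
           WHEN ps.used_page_count * 8 / 1024 <= 10240 THEN 'Large'
           WHEN ps.used_page_count * 8 / 1024 <= 15360 THEN 'Huge'
           ELSE 'Gigantic'
       END AS size_bucket
FROM sys.dm_db_partition_stats AS ps
JOIN sys.objects AS o
    ON ps.object_id = o.object_id
JOIN sys.indexes AS i
    ON ps.object_id = i.object_id AND ps.index_id = i.index_id
WHERE o.type = 'U'
ORDER BY size_mb DESC;
```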

Below is IX_PerformanceAnalysisData_Wide on PerformanceAnalysisData. This is the only index on the table, which is one of the largest in the SQL Sentry database. It has scan activity but no updates. As a "Gigantic" index that compresses significantly with page compression, I recommend using page compression. I noted "ROW if CPU concerns," and I should clarify that those are concerns about CPU while applying compression or during index maintenance, not during normal activity. Since updates are low to none, there is little else to cause CPU concern, and as previously shown, the average impact is typically less than 1%.

Figure 3-D showing decision table for PerformanceAnalysisData
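Before committing a large index to a rebuild, you can ask SQL Server to estimate the savings for each compression type. A hedged example for this table (the schema name and index ID here are assumptions; verify yours against sys.indexes first):

```sql
-- Estimate page compression savings for the index by sampling it,
-- without actually rebuilding anything. Run once per compression type
-- ('ROW', 'PAGE') to compare.
EXEC sp_estimate_data_compression_savings
     @schema_name      = 'dbo',
     @object_name      = 'PerformanceAnalysisData',
     @index_id         = 1,    -- assumed ID of IX_PerformanceAnalysisData_Wide
     @partition_number = NULL,
     @data_compression = 'PAGE';
```

The procedure returns current and estimated sizes, from which the "% Savings" figures in these decision tables can be derived.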

The PerformanceAnalysisPlan table that I showed in my first post does not contain any indexes that I would recommend compressing:

Figure 3-E showing decision table for PerformanceAnalysisPlan

The PerformanceAnalysisTraceData table is an example of mixing compression types across indexes. The "% Savings" on PK_PerformanceAnalysisTraceData is similar for row and page, and the index has a high U value compared to S. It does not compress very well, but it is gigantic in size, which is why I recommend row compression. IX_MaxIDs is a tiny index, but it has a decent amount of scans, so compressing that data may help performance. IX_PerformanceAnalysisTraceData_Wide is well suited for page compression, given its "% Savings" and minimal activity.

Figure 3-F showing decision table for PerformanceAnalysisTraceData
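Applied to this table, the mixed recommendation amounts to rebuilding each index with its own setting. A minimal sketch (full scripts follow in the next post; ONLINE rebuilds assume an edition that supports them):

```sql
-- Row compression for the PK: high update activity, modest savings.
ALTER INDEX PK_PerformanceAnalysisTraceData
    ON dbo.PerformanceAnalysisTraceData
    REBUILD WITH (DATA_COMPRESSION = ROW, ONLINE = ON);

-- Page compression for the wide index: good savings, minimal activity.
ALTER INDEX IX_PerformanceAnalysisTraceData_Wide
    ON dbo.PerformanceAnalysisTraceData
    REBUILD WITH (DATA_COMPRESSION = PAGE, ONLINE = ON);
```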

In my next post, I will provide the scripts for applying the recommended compression and some tips on how to execute them in your database.

Comments (5)

      • Karl says:

        What perfmon counter did you use to measure average total memory usage?

      • Shahriar A. says:

        Many thanks for posting this set of articles. You discuss SQL Server compression in context of the SQL Sentry database but much of this information added a lot more detail to what I knew about SQL Server data compression. Thanks for sharing your knowledge.

      • Melissa Connors says:

        Hi Shahriar,

        Thanks so much for your comment! I'm glad to hear that you found the information in the posts useful.

        -Melissa

      • Anonymous says:

        Great post, Melissa. Very interesting findings!

        One question and one comment. First, did you encounter a specific wait statistic during your testing that shows wait time spent on compression activity? Just curious if such a beast exists. Second, and this is minor, but as a red/green colorblind person, I found the charts pretty hard to distinguish based on color alone, especially the first two columns.

        Thanks for this very good study!

        -Kev

      • Melissa Connors says:

        Hi Kev,

        I do not have the wait statistics information, but I will look into that and get back to you! :-)

        I'll keep the color suggestions in mind for the future as well and improve labeling where distinguishing between the colors on the chart is particularly important to understanding it.

        Thank you for your feedback!
        -Melissa
