SentryOne Team Blog (blogs.sentryone.com)

A "drop column" fix you'll want for SQL Server 2012 / 2014

Updated April 25th, 2016

If you ever drop columns, you probably already know that in order to truly reclaim the space, you need to rebuild the clustered index. Back in 2010, Michael Swart (@mjswart) treated this topic, focusing on the before and after wasted space aspects.

Unfortunately, wasted space is not the worst thing that can happen if you drop a column and don't immediately rebuild the clustered index (or heap). A defect has been discovered where subsequent updates can actually delete data in a completely different column, with no warning or error message – you might also call this undetectable data loss or even – gaspcorruption.

I don't have a repro script handy (and it would probably be irresponsible of me to provide one anyway). There are some further details in KB #3120595, but my take on what happens is basically that SQL Server simply points at the wrong part of the page, likely because offsets are wrong due to the old column slot that is still "kind of" there. This is reminiscent of a different issue involving dropped columns, KB #2504090, though that symptom is isolated to partitioned tables.

On April 2nd, the KB article added a script to check if you have any tables that are potentially vulnerable due to dropped columns. I've re-written it to conform to my conventions (most recently to not use a filter in the where clause for an outer joined table, which turns that into an inner join), and to generate ALTER commands that account for both clustered indexes and heaps:

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
 
SELECT N'ALTER ' + CASE p.index_id
  WHEN 0 THEN N'TABLE ' ELSE N'INDEX ALL ON ' END
  + QUOTENAME(s.name) + N'.' + QUOTENAME(o.name) 
  + N' REBUILD WITH (MAXDOP = 1'
  + N',ONLINE = ON' -- depending on edition
  + N'); /* rows affected: ' + CONVERT(VARCHAR(11), SUM(p.[rows])) + N' */'
  FROM sys.objects AS o
  INNER JOIN sys.schemas AS s
  ON o.[schema_id] = s.[schema_id]
  INNER JOIN sys.partitions AS p
  ON o.[object_id] = p.[object_id]
  WHERE p.index_id IN (0,1)
    AND o.is_ms_shipped = 0
    AND EXISTS 
    (
      SELECT 1 
        FROM sys.system_internals_partition_columns AS pc
          WHERE pc.is_dropped = 1
          AND pc.[partition_id] = p.[partition_id]
    )
    AND EXISTS 
    (
      SELECT 1
        FROM sys.columns
        WHERE [object_id] = o.[object_id]
        AND system_type_id NOT IN (48,52,56,61,62,104,127,173)
    )
    GROUP BY p.index_id, s.name, o.name
    ORDER BY SUM(p.[rows]) DESC;

On April 25th, Microsoft's Pedro Lopes published a blog post with further details about repro scenarios, including the fact that they have only seen this in customers who have upgraded to 2012 or 2014 from 2005 (whether that be an in-place upgrade, backup/restore, or detach/attach).

Fixes have been published for SQL Server 2012 (SP2 and SP3) and for SQL Server 2014 (RTM and SP1). Until you're patched – and maybe even afterward – you should plan to rebuild tables after dropping columns as a matter of course and, most importantly, as part of the same maintenance operation to minimize the risk of corruption.

Major Version Service Pack Branch Cumulative Update with Fix Build #
SQL Server 2012 Service Pack 2 Cumulative Update #10 11.0.5644
Service Pack 3 Cumulative Update #2 11.0.6523
SQL Server 2014 RTM Cumulative Update #13 12.0.2568
Service Pack 1 Cumulative Update #6 12.0.4449

If you're on 2012 or 2014 and not on these supported branches, here's one more kick to motivate you to get on board the current service pack and most recent cumulative update train. In any case, regardless of version, drop columns with care, especially before you get patched.

Still no word if 2008 and 2008 R2 are affected, but even if they are, it is unlikely they will get a fix.

Comments ( 19 )

              • Chris Wood says:

                The KB has been updated yet again with the symptoms being further enhanced mentioning SQL2005 being upgraded to SQL2012 or SQL2014.

                The Index rebuild has now become a MUST after columns have been dropped.

              • Chris Wood says:

                We are now on revision 17 and the query has changed to:-

                SELECT DISTINCT OBJECT_NAME(sp.[object_id]) AS TableWithDroppedCols
                FROM sys.system_internals_partition_columns sipc1 WITH (NOLOCK)
                INNER JOIN sys.system_internals_partition_columns sipc2 WITH (NOLOCK)
                ON sipc1.[partition_id] = sipc2.[partition_id] AND
                sipc1.leaf_offset = sipc2.leaf_offset AND
                sipc1.leaf_offset < 0 AND
                sipc1.is_dropped < sipc2.is_dropped
                INNER JOIN sys.partitions sp WITH (NOLOCK) ON sipc1.[partition_id] = sp.[partition_id];
                GO

                With limited knowledge of the description of sys.system_internals_partion_columns is I cannot see how the new version of the Microsoft query will actually tell you anything.

                Chris

              • Chris Wood says:

                Erik,

                Which 2012 bug are you referring to that needs the maxdop?

                Chris

              • Monik says:

                Chris:

                FIX: Data corruption occurs in clustered index when you run online index rebuild in SQL Server 2012 or SQL Server 2014

                https://support.microsoft.com/en-us/kb/2969896

              • Chris Wood says:

                Monik,

                Unless you are on a build with the KB #3120595 fix included, and that will include the 2969896 fix, you stand the chance of the missing column data happening again if you don't rebuild clustered indexes after any variable length columns are dropped.

                Microsoft will only fix SQL2012 SP2 and SP3 and SQL2014 RTM and SP1 branches. Don't expect and SQL2012 RTM or SP1 fix even though they are definitely affected.

                Chris

              • ErikEJ says:

                The SQL 2014 SP1 CU6 release is now public: https://support.microsoft.com/en-us/kb/3120595
                And I suggest adding MAXDOP=1 to the rebuild script, to avoid being hit by another SQL 2012 bug

              • Donna says:

                Aaron,

                Do you mean for SQL Server 2014 Cumulative Update #5 for Service Pack 1? I can't find a Cumulative Update #6. And the build number for Cumulative Update #5 is 12.0.4439.1.

                Donna

              • Aaron Bertrand says:

                No, I mean Cumulative Update #6. It's not available yet, but based on their pub schedule, should be out by the end of next week. That build number notation is supposed to simply indicate that the build will be something larger than 4439

              • ErikEJ says:

                Will SQL 2012 SP1 never be patched?

              • Aaron Bertrand says:

                Erik, I'm not sure 2012 SP1 is affected (this regression could have been introduced in SP2), but even if it is, no, the RTM and SP1 branches are retired, so to obtain further engine fixes (that aren't critical security issues), you'll usually be instructed to move to a newer, fully supported service pack branch.

              • Chris Wood says:

                Erik,

                This was a design issue so it is in SP1 but as Aaron says you will need to move up to either SP2 or better SP3 to get this fix. They told me that it wouldn't be a hotfix for SQL2014 when they disclosed that it affected 2014.

                Chris

              • Andre Ranieri says:

                Is there a way to get on Microsoft's distribution list for support articles when they first come out?

              • Aaron Bertrand says:

                You can point your RSS at https://support.microsoft.com/en-us/gp/selectrss?target=rss

                Also, we keep these posts up to date (sometimes with a little help from Chris), often with added commentary and recommendations:

                /team-posts/latest-builds-sql-server-2012/
                /team-posts/latest-builds-sql-server-2014/
                /team-posts/latest-builds-sql-server-2016/

              • Chris Wood says:

                Andre,

                Two good sources are http://sqlserverbuilds.blogspot.ca/ They tend to tell you about hotfixes. Also the CU's come out about a week after Patch Tuesday and you can find details here https://support.microsoft.com/en-ca/kb/957826

                This is where I look.

                Chris

              • Chris Wood says:

                Thanks for getting this important message out to the community Aaron. It shouldn't take so long for Microsoft to give a clear and concise definition that will help all of us to understand both the problem and how to solve it.

                I have been told that DBCC CHECKDB would not help discover this.

                Chris

              • Aaron Bertrand says:

                Yes, I believe that, because the "corruption" does not happen until a vulnerable row is updated – so there will be no warning (and I don't know if CHECKDB would find any fault in the after state of the row, either).

              • Chris Wood says:

                LondonDBA,

                It was designed into 2012 and 2014. If you dropped variable-length columns without rebuilding the clustered index you might want to rebuild the clustered index and apply the patch to stop it happening. Nobody knows the exact situation that caused the problem but better be safe than sorry.

                Chris

              • LondonDBA says:

                Hi Aaron, I completely agree with you when you say the details in KB #3120595 are scant. We need to know exactly when the bug was introduced. Was it in SQL 2012 RTM or was it introduced in a Service Pack or Cumulative Update ? Our environment is probably similar to others in that we don't constantly patch all our sql servers to the very latest patch levels. We have different Service Packs and CUs installed. If Microsoft gave us much more information we would know how exposed we actually are. Without this information we're trying to reproduce the issue ourselves on our various versions.

                One part of me arranges with you when you say that it may be irresponsible for you to provide a repro script for this but a much bigger part of me thinks you would be doing the community a huge service if you did provide a very simple repro script. In the current situation where MS are providing next to nothing at least with your script we would be able to definitively show how exposed our estate is to this very serious issue.

                Thanks for the blog post.

              • Aaron Bertrand says:

                LondonDBA, I agree with Chris, just plan from now forward to always rebuild as part of any table maintenance that involves dropping columns.

                They have updated information in KB #3120595:

                • they added a little bit more info about the cause (but still not enough for a reliable repro IMHO)
                • they removed 2014 as an affected version (my guess, to return once they have a fix later this month)
                • they supplied a script to identify any existing tables that are potentially vulnerable

              Leave A Comment

              Your email address will not be published.

              This site uses Akismet to reduce spam. Learn how your comment data is processed.