Bad habits : Putting NOLOCK everywhere - SQL Sentry

Bad habits : Putting NOLOCK everywhere

[See an index of all bad habits / best practices posts]

I went through the archive above, and was quite surprised to discover that I have never blogged explicitly about NOLOCK. Out on the forums, I typically refer to this hint as a "magic, pixie-dust turbo button." It may seem to make some queries faster, but at a significant cost that must be weighed. I'm not saying don't ever use the read uncommitted isolation level for any query; the "habit" I am talking about here is not that of using NOLOCK in an isolated scenario where the effects are known and deemed inconsequential, but rather, that of slapping NOLOCK on every table mentioned in every query in the entire workload.

NOLOCK Effects

What are the effects you need to worry about? Well, others, like Andrew Kelly, have described this before, Paul White digs quite deep into it, these search results are quite telling, and Kendra Little even has a video about it… but I'll point them out anyway:

  1. "Dirty read" – this is the one most people are aware of; you can read data that has not been committed, and could be rolled back some time after you've read it – meaning you've read data that never technically existed.
     
  2. Missing rows – because of the way an allocation scan works, other transactions could move data you haven't read yet to an earlier location in the chain that you've already read, or add a new page behind the scan, meaning you won't see it at all.
     
  3. Reading rows twice – similarly, data that you've already read could be moved to a later location in the chain, meaning you will read it twice.
     
  4. Reading multiple versions of the same row – when using READ UNCOMMITTED, you can get a version of a row that never existed; for example, where you see some columns that have been changed by concurrent users, but you don't see their changes reflected in all columns. This can even happen within a single column (see a great example from Paul White).
     
  5. Index corruption – surely you are not using NOLOCK in INSERT/UPDATE/DELETE statements, but if you are, you should be aware that this syntax is deprecated and that it can cause corruption, even in SQL Server 2014 RTM – see this tip for more information. Note that you should check for the hint in any views that you are trying to update, too.
     
  6. Read error – because the underlying data could be moved or deleted during your read, you could see this error:
    Msg 601, Level 12, State 1
    Could not continue scan with NOLOCK due to data movement.

    Back in SQL Server 2000, this bug was supposedly fixed (see KB #815008), but only under a trace flag (9134) – and several customers have reported that it can still happen, even with the flag, and even in SQL Server 2014.

Maybe these effects are okay to you. Maybe you're using NOLOCK in scenarios where 100% accuracy isn't crucial (a rough ballpark of the number of messages posted to your forum today, a monitoring tool collecting aggregate metrics from DMVs) or where it can't really hurt (pulling data from a static auxiliary table like a numbers or calendar table). But maybe you aren't aware of the potential issues listed above, or don't believe they can happen to you. Let me assure you: they can. Not convinced? There's more! Please read on.

Poor Naming Choice

It really is too bad that they named this hint NOLOCK. The semantics of read uncommitted don't actually mean "take zero locks" – they actually mean "take no shared locks." I have seen many people surprised to see rows in sys.dm_tran_locks for a query that is running under read uncommitted. Here is a quick example, where we can see Sch-S (schema stability) and other locks taken out for an object, even though NOLOCK is used (click to enlarge):

Sch-S locks showing even under read uncommittedSch-S locks showing even under read uncommitted

These aren't held for the duration of the transaction, but the locks are maintained while the query is running. What does this mean? Well, if you have lots of users running read-only queries under NOLOCK, you may still have issues "cutting in line" to make any changes to the table or its indexes, because you will be blocked by Sch-S. It is a good thing to be conscious of this issue – "NOLOCK" in this case isn't exactly working as advertised.

There are other scenarios, too. The query writer doesn't always have control over whether the objects they are referencing will obey the current session's isolation level semantics. I gave an example recently regarding the metadata helper functions like OBJECT_NAME() – a NOLOCK query can still be blocked because several of these functions enforce their own isolation level. You can also hit this if you reference your own functions or views with stronger hints, or only apply the hint to some of the tables involved in the query. To avoid the latter, and to make a later change to a better isolation level easier (more on this below), I always recommend using the session-level hint instead of the table-level WITH (NOLOCK):

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

If you insist on using the hint at the table level, you need to be very careful about the syntax. Compare the following, both of which are *allowed* syntax, but with markedly different behavior (click to enlarge):

Important differences in query syntax - alias vs. hintImportant differences in query syntax – alias vs. hint

On the left, session 68 just used FROM dbo.tablename NOLOCK – as you can see from the results, this was actually interpreted as a table alias, not a hint, and as a result, it took out shared and intent shared locks at the page and object level – which is not behaving the same way as the author likely expected. So please, always use the fully explicit form, FROM dbo.tablename [AS alias] WITH (NOLOCK).

Did you say a better isolation level?

You can use a different isolation level to avoid the issues listed at the beginning of this article, while still preventing readers from blocking writers and vice-versa. Once again, Kendra comes to the rescue with a great post explaining SNAPSHOT and READ COMMITTED SNAPSHOT:

RCSI in particular is a great way to get the performance of NOLOCK without sacrificing accuracy. Don't just go turn this feature on, though; there is no such thing as a free lunch. You'll want to note the following, and make sure that you have thoroughly tested all of these scenarios before making the change:

  • Sch-S locks still need to be taken even under RCSI.
  • Snapshot isolation levels use row versioning in tempdb, so you really need to test the impact there.
  • RCSI can't use efficient allocation order scans; you will see range scans instead.
  • Paul White (@SQL_Kiwi) has some great posts you should read in his blog series on isolation levels.
Other Alternatives

Since using the snapshot isolation levels is not, and should not be, a simple "switch and forget it" operation, you may want to look into other alternatives to NOLOCK. Since this usually involves read-heavy queries, you could consider letting users read a copy of the data, and your choices are limited only by the edition of SQL Server you're using and how stale the data is allowed to be. Some options off the top of my head:

Solution Granularity Currency of data Edition requirements Effort / maintenance
Availability Groups
(read-only secondaries)
Database Near real-time Enterprise Edition Medium
Database Mirroring
(snapshots)
Database Snapshot frequency Enterprise Edition High
Replication Object Near real-time Most editions High
Log shipping Database Log backup frequency Any edition Low
Application-level
data caching
Object Implementation dependent Any edition Medium

In a previous life, I have implemented something similar to the application-level data caching approach, but I replicated data, on a schedule within to multiple SQL Server Express instances, each sitting on an application server. This way the application had relatively current data, but it was static and local, so no issues with concurrent access, no conflicts between readers and writers, and no network round-trips. You can read more about this solution here and here.

I do have some work invested in a post about a poor man's secondary involving log shipping; I will update this post when that one is published.

Conclusion

I am not vehemently against NOLOCK – there are definitely use cases where you can "get away with it." I used it quite a bit earlier in my career, but looking back, mostly to avoid solving the real issue(s). Today I am just against it being used in every single query as an implicit rule. If you are using it everywhere, I hope that I have convinced you to reel that in a bit, and use it only in those scenarios where accuracy is not important – or at least where accuracy can be traded for other priorities. Better yet, use one of the alternatives I discuss above, though I do acknowledge that change takes time and sometimes a lot more.

[See an index of all bad habits / best practices posts]

Comments ( 24 )

                    • T-SQL Tuesday #83: Why leave well enough alone? – The Rest is Just Code says:

                      […] NOLOCK hints everywhere. It's even in newly-developed code for this application. I asked a developer why they were using it, given the availability of Read Committed Snapshot Isolation, but it seemed that they only knew of NOLOCK as the magic turbo button. […]

                    • Tor says:

                      All the examples I see show it is the table that is deleted from that has the nolock hint. What about a scenario like below where it is locking a second table? Could that potentially cause any issues…especially index corruptions, on TABLE_A. I do understand it might delete records from TABLE_A that were never committed in TABLE_B, but I'm not too concerned about that.

                      delete from TABLE_A
                      WHERE ID IN (SELECT ID from TABLE_B with(nolock))

                    • Aaron Bertrand says:

                      Yes, that is probably safe from the corruption issue, but as you acknowledge, it might not be worth it from a data integrity standpoint (just like every other operation involving NOLOCK).

                    • Jo Ma says:

                      I would say it is best to understand exactly how NOLOCK/READUNCOMMITED works and use it only where and when it is appropriate. Used properly it can be a powerful tool to save you and your customers time and pain. Used wrongly it could have the reverse effect, however the situations where it can cause issues are specific and well known. I prefer not to instill myself with any dogmas about whether doing some specifically in SQL Server is good or bad [habit], it is simply another tool in the arsenal which is there for me to learn to use it properly and in the right place. Here is an interesting little article that very easily explains what nolock does: http://sqlserverplanet.com/tsql/using-with-nolock. And here is a freaking complicated article that also explains it: https://www.mssqltips.com/sqlservertip/2470/understanding-the-sql-server-nolock-hint/ . If you want to have better understanding, don't be afraid to try it yourself in your environment. Calculate your risks, If you're too afraid of "dirty reads" have a plan for cleanup. BTW, did you know that the ability to support dirty reading is actually a database feature? They would not support this behavior if if it wasn't beneficial in many situations.

                    • Aaron Bertrand says:

                      I largely agree. Keep in mind, the thrust of my post was not "do not use NOLOCK ever" – it was "do not blindly put NOLOCK *everywhere*, and understand all of the things that can go wrong where you do use it."

                    • Mordechai Danielov says:

                      Hi, confused by what I see here – "surely you are not using NOLOCK in INSERT/UPDATE/DELETE statements"
                      as far as I know you can't even if you wanted to (code below executed on SQL2012 express)-

                      create table t (col1 char(10))
                      insert t with (nolock) VALUES('hello')
                      ——
                      or
                      ____
                      update t with (nolock) set col1 = 'bye'
                      ____
                      either one results in :

                      Msg 1065, Level 15, State 1, Line 15
                      The NOLOCK and READUNCOMMITTED lock hints are not allowed for target tables of INSERT, UPDATE, DELETE or MERGE statements.

                      yet, interestingly enough, BOL (https://technet.microsoft.com/en-us/library/ms187373(v=sql.110).aspx) says:
                      For UPDATE or DELETE statements: This feature will be removed in a future version of Microsoft SQL Server. Avoid using this feature in new development work, and plan to modify applications that currently use this feature.

                      what to make of this?

                    • Aaron Bertrand says:

                      The 2012 deprecation chart shows the same thing, though both are obviously incorrect. The syntax has been deprecated since 2005, and discontinued three releases later, in 2012.

                      My point remains: You are on 2012, but many other readers are still on 2005, 2008, or 2008 R2, where the syntax is still valid and doesn't raise an error.

                    • Mordechai Danielov says:

                      2008 also doesn't allow it. (at least as of SP1), but I see where you going with this. Interesting that no one bothered to clean up BOL all this time.

                    • JimG says:

                      This is a great explanatory article, and thanks for the details. One question – a former employer of mine insisted on using the WITH (NOLOCK) everywhere, because, once upon a time, a real-time system was getting queries blocked by other concurrent users/updates. So, the edict came down, to avoid any OLTP delays, all queries must use WITH (NOLOCK) – it even became part of the architectural-review requirements!
                      What alternatives are there to letting real-time processes NOT get locked out (where response time is measured in a few seconds). We were told that we could not have the query "wait in line" behind any other process that may be doing updates.

                      Thanks again!

                    • Aaron Bertrand says:

                      Thoughts that come to mind are read committed snapshot, using a readable secondary in an availability group, or cruder methods of doing the same sort of thing where you're providing a reasonably up-to-date copy of the data that isn't fighting for system-wide write concurrency. If you want readers to be up to the second, that's expensive (RCSI takes a toll on tempdb, and AGs are tedious to configure and literally expensive); if you can handle a delay, then other methods can work well – I talk about a couple of these in the following articles:

                      Of course, for certain things, as long as you're okay with the potential for inaccuracy, NOLOCK is not the devil, but you do need to be aware of all the things that can go wrong. Because they will.

                    • Manuel says:

                      OMG – Please tell me how you got SSMS background and colors to look like that! That would save me so much eye strain at the end of the day if I can make SSMS look like the visual studio that the dev guys use instead of the constant white eye killing background! I would appreciate that immensely. Thanks ahead of time.

                    • Aaron Bertrand says:

                      Sadly I did this manually in SSMS Tools / Options:

                    • Aaron Bertrand says:

                      Manuel, also see:

                      /aaronbertrand/making-ssms-pretty-my-dark-theme/

                    • Kevin says:

                      Hi Aaron,
                      Thanks for the NOLOCK information. I believe this topic is a direct result of the SQL Sentry training session we had with you and found the information to be extremely valuable in improving our decision making process. I'm now working through your bad habits index to find other gems of SQL knowledge we can use. Thanks again.

                    • Aaron Bertrand says:

                      Hi Kevin,

                      Actually, no, it's one I've been meaning to write for a long time. In fact I was surprised to learn that I *hadn't* written about it before (usually it's the other way around – I start blogging about a topic and then find an old post of mine about the same topic).

                      Cheers,
                      Aaron

                    • Arthur says:

                      Hi Aaron,
                      I am curious why WITH (READPAST) table hint is not mentioned here as an alternative?

                    • Aaron Bertrand says:

                      Hi Arthur,

                      Not any particular reason, I just don't personally like it as an alternative, because it means that rows are not returned in the result simply because they are locked. This does not give me any confidence at all in the results (a little more confidence than NOLOCK, but not enough for most scenarios).

                    • retracement says:

                      Hi Aaron, I'd take issue with the table in Other Alternatives. Database Mirroring with Snapshot being Realtime? Typo surely? And even synchronous AG read-only secondaries cannot be considered real-time. They are close to real-time for sure, but the redo thread might still be slightly behind. Only the log commits can be considered synchronous. One reason why MS normally slip in the word "reporting" into their blurb about offloading read-only workloads :). The danger of course is using the replica data as a basis for transaction updates against the primary.

                    • Aaron Bertrand says:

                      You're right, none of these are real time, though you can get closer with some than others (I don't know how many degrees of adjectives I should use for "real-time"). :-)

                      And I agree, I am certainly not advocating that copies meant for reading/reporting access should be used as the basis for updates to the primary. I am suggesting these alternatives purely for read-only consumption.

                    • retracement says:

                      haha great reply :) Well I'm offering you "not quite real-time", "a bit behind" or "what did you expect! RTFM" :) Anyway thanks for writing this up. NOLOCK is a favourite bug-bear of mine too.

                    • @sql_handle says:

                      Excellent, very helpful!!
                      Do some dmvs have an implicit nolock? I've seen queries of proc cache with nolock included in the query… but while working with SQL Server 2014 "select * from Sys.dm_os_waiting_tasks" sometimes seemed to give a result set with some individual rows repeated as many logical cpus on the system.
                      I guess maybe that's a consequence of some type of lock partitioning rather than nolock?

                    • Aaron Bertrand says:

                      Hard to tell if there are any implicit NOLOCKs (I'm not sure they're necessary, to be honest, though several colleagues swear by explicitly using read uncommitted for all DMV queries). sys.dm_os_waiting_tasks is simply:

                      SELECT * FROM OpenRowSet(TABLE SYSWAITS)

                      As for your specific observation, won't you see multiple rows in that DMV for processes that have gone parallel (just like in sys.dm_exec_requests)? I really doubt those are rows that have been read multiple times – are they really 100% identical?

                    • @sql_handle says:

                      You're right – need more practice looking at parallel queries. :-) Was testing checkdb on server with 24 schedulers and maxdop 0. Was surprised to see over 400 rows returned from waiting tasks for the checkdb session_id. But its because each task in cxpacket wait was waiting/blocked by multiple other execution IDs of the same session. Makes sense, I was mistakenly thinking even threads waiting for cxpacket would show up a single time in sys.dm_os_waiting tasks. :-)

                      SELECT
                      session_id,
                      exec_context_id,
                      wait_duration_ms,
                      wait_type,
                      blocking_session_id,
                      blocking_exec_context_id
                      FROM sys.dm_os_waiting_tasks
                      WHERE session_id = 56
                      ——————————————————
                      10 35066 CXPACKET 56 42
                      10 35066 CXPACKET 56 47
                      10 35066 CXPACKET 56 41
                      10 35066 CXPACKET 56 46
                      10 35066 CXPACKET 56 45
                      10 35066 CXPACKET 56 44

                    Leave A Comment

                    Your email address will not be published.