Bad Habits Revival - SentryOne Team Blog

Bad Habits Revival

https://www.flickr.com/photos/badgreeb_records/6431633105A few years ago, I started a relatively thorough series of T-SQL "Bad Habits" – things I see people do, often, that can cause problems in various ways (not always related to performance). Over the past 6+ years I have posted a number of other posts, too, that promote best practices and try to squash bad habits. Skip to the index.

I've given dozens of presentations centered around these bad habits, and plenty more have cropped up in the meantime. So I thought I would jot a few more down, and provide a handy index to all of them. First, though, I want to explain my motivation for writing these posts, because I don't think it always comes across the right way.
 

I'm not trying to correct you, or make you feel like you are "doing it wrong."
I'm trying to help you be consistent, avoid issues, and set better examples.

In many cases, the people committing these quasi-sins are well aware of the situations where they can break, and know that they are currently safe. For example, many people will declare VARCHAR without length in this scenario:

SELECT CONVERT(VARCHAR, some_int_column) FROM ...

They can get away with not declaring the length for the VARCHAR here, because the default SQL Server applies in this case is 30, and no integer value could ever be longer than 11 digits. The problem is that leaving off the length can break in *other* scenarios, where the default is 1 (Microsoft basically laughed at me when I suggested they make this consistent). And "break" doesn't always mean "generate an error message," making it easy to discover during development, and fix long before it makes it to production. In this case, you can irrecoverably lose data, and not even know about it:

DECLARE @x VARCHAR = '123'; SELECT @x; -- result: 1

(That works the same way for input parameters to a procedure, by the way, but I can't demo that in one line.)

So my issue in many cases is not that people take advantage of the situations where they know it can't break – they know the risks in other scenarios. My issue is when those people put code in the codebase like that, or – worse – blog or answer questions on Database Administrators or Stack Overflow that implicitly advocate the bad habit. The people who see that code (junior developers/DBAs, future maintainers, kids fresh out of college) don't get all of the author's knowledge as a by-product, so they may not be aware that – while this tactic is okay in the specific case they're looking at (or maybe it's not!) – it can break severely in others.

Another aspect of this is consistency. It seems wrong to me to use the length above when you need to, but not bother when you don't. So why not just specify the length, always? An even better example is the lazy shorthand people use for functions like DATEPART, DATENAME and DATEDIFF. You might be surprised by this result (I won't spoil the fun; go ahead and try it):

SELECT DATEPART(Y, GETDATE());

If you can't use Y in this case, why use M or D in other cases? Why not just type out YEAR or MONTH or DAY, always? Not only do you avoid this inconsistency where sometimes you need to spell it out and sometimes you don't, your code also becomes much more self-documenting. Since you type the code once, and others may read it many, many times, I don't buy that the extra cost of spelling out the word (measured in microseconds) is a productivity killer. I have never heard of anyone getting to leave for the long weekend early because they shaved all that time off from not having to type those characters, but I have seen long weekends start late because people were troubleshooting some mess that was harder to debug because of shorthand.

I feel similarly about the INFORMATION_SCHEMA views. Since they are incomplete, aren't being updated for most new features, and Microsoft themselves recommend against them, I find it hard to justify using them in cases where you can (like getting just a list of columns from a table), but then having to use the catalog views when you need information the INFORMATION_SCHEMA views simply don't have (like partitions, or filtered indexes, or INCLUDE columns). Unless you need to write code that will work on multiple RDBMS platforms (in which case you can't use any of the SQL Server-specific features anyway), why not just use the catalog views always?

So, to summarize my long-windedness about T-SQL bad habits: be consistent, avoid issues, and set better examples. :-)  

 

THE INDEX

Here is an ongoing list of articles that I consider to be along these lines – either promoting best practices or eradicating bad habits; not all are explicitly framed as a "bad habit," but they do all represent in some way things I wish I observed less often. Some of my opinions are controversial, and many have evoked very passionate comment threads – so I recommend scrolling down for those, too.

Date Post
2017-01-26 Performance Surprises and Assumptions : GROUP BY vs. DISTINCT
2016-11-09 Can comments hamper stored procedure performance?
2016-11-02 #BackToBasics : CAST vs. CONVERT
2016-10-04 #BackToBasics : Why I use lower case for data type names (now)
2016-09-06 Pattern Matching : More Fun When I Was a Kid
2016-09-01 #BackToBasics : Naming Stored Procedures
2016-08-03 #BackToBasics : Great Debates : Unicode
2016-07-11 Paying Attention to Estimates
2016-06-01 #BackToBasics : An Updated "Kitchen Sink" Example
2016-04-11 Performance Surprises and Assumptions : DATEADD()
2016-04-06 #BackToBasics : Dating Responsibly
2016-02-03 #BackToBasics : The "Runaway" Query
2016-01-18 Subjectivity : Naming Standards
2016-01-06 #BackToBasics : Common Table Expressions (CTEs)
2015-11-10 T-SQL Tuesday #72 : Models Gone Wild!
2015-10-08 Maintaining a grouped running MAX (or MIN)
2015-08-07 Bad Habits : Being CarELesS about cAsE
2015-06-29 FORMAT() is nice and all, but…
2015-06-05 How not to call Hekaton natively-compiled stored procedures
2015-06-01 Protecting Yourself from SQL Injection in SQL Server – Part 2
2015-05-27 Protecting Yourself from SQL Injection in SQL Server – Part 1
2015-04-07 Bad Habits : Using MDF/LDF Files as "Backups"
2015-03-12 Best Practices : Properly referencing columns
2015-02-26 Should I use NOLOCK against #temp tables?
2015-02-19 Developers need to know "DBA stuff" – and vice-versa
2015-02-13 Fun with THROW : Avoid % and use semi-colons!
2015-01-06 Bad habits : Focusing only on disk space when choosing keys
2014-10-30 Bad habits : Counting rows the hard way
2014-10-22 Bad habits : Another case for semi-colons and schema prefix
2014-10-08 Bad habits : Using AttachDBFileName
2014-09-30 Bad habits : Clinging to old compatibility levels
2014-09-24 Bad habits : Putting NOLOCK everywhere
2014-09-22 Stop making SQL Server do your dirty work
2014-09-16 Bad habits : Looking for optimizations in all the wrong places
2014-09-09 Bad habits : Giving out the sa password
2014-09-04 Bad habits : Using (certain) metadata "helper" functions
2014-07-08 T-SQL Tuesday #56 : SQL Server Assumptions
2014-06-12 Dirty Secrets of the CASE Expression
2014-02-24 Avoid using NOLOCK on SQL Server UPDATE and DELETE statements
2014-01-23 For the last time, NO, you can't trust IDENT_CURRENT()
2013-09-26 Generate random integers without collisions
2013-12-19 Improve SQL Server Efficiency by Switching to INSTEAD OF Triggers
2013-10-17 Use Caution with SQL Server's MERGE statement
2013-09-05 Performance Surprises and Assumptions : DATEDIFF
2013-08-14 Performance Surprises and Assumptions : Arbitrary TOP 1
2013-07-26 Follow-up on Summer Performance Palooza 2013 (Habits & Practices Q&A)
2013-06-06 Don't just blindly create those 'missing' indexes!
2013-05-17 Another argument for stored procedures
2013-03-13 Break large delete operations into chunks
2013-01-18 Generate a set or sequence without loops – part 3
2013-01-17 Generate a set or sequence without loops – part 2
2013-01-16 Generate a set or sequence without loops – part 1
2012-10-15 Is the sp_ prefix still a no-no?
2012-09-20 What impact can different cursor options have?
2012-09-13 How much impact can a data type choice have?
2012-08-16 Splitting Strings : Now with less T-SQL
2012-04-30 Deciding between COALESCE and ISNULL in SQL Server
2012-04-24 Minimize SQL Server plan cache bloat
2012-02-27 Bad Habits to Kick : Believing everything you hear or read
2012-02-28 Checking for potential constraint violations before entering TRY / CATCH
2012-01-26 Bad Habits to Kick : Thinking a WHILE loop isn't a CURSOR
2012-01-23 Bad Habits to Kick : Using AS instead of = for column aliases
2011-12-13 T-SQL Tuesday #25 : T-SQL Tips 'n' Tricks
2011-11-03 The case against INFORMATION_SCHEMA views
2011-10-19 What do BETWEEN and the devil have in common?
2011-09-20 Bad Habits to Kick : Using shorthand with date/time operations
2011-09-17 Bad Habits to Kick : Using EXEC() instead of sp_executesql
2011-08-10 T-SQL Tuesday #21 : Crap Code
2011-07-12 T-SQL Tuesday #20 : T-SQL Best Practices
2011-01-26 Avoid External Dependencies in SQL Server Triggers
2010-12-29 Making a more reliable and flexible sp_MSforeachdb
2010-03-08 Bad habits to kick : ignoring I/O
2010-03-07 Bad habits to kick : creating the uber-view
2010-02-22 Bad habits to kick : using ancient copies of Books Online
2010-02-15 Bad habits to kick : inconsistent table aliasing
2010-02-12 Bad habits to kick : ignoring the principle of least privilege
2010-02-10 Bad habits to kick : blind SQL Server installs
2010-02-08 Bad habits to kick : putting an IDENTITY column on every table
2010-02-08 Bad habits to kick : relying on undocumented behavior
2010-01-14 When you don't follow your own "bad habits" advice…
2009-10-16 Bad habits to kick : mis-handling date / range queries
 
Webucator, a provider of SQL Server training, has produced a video based on this post.
2009-10-14 Bad habits to kick : using the visual designers
2009-10-14 Bad habits to kick : using alias types
2009-10-12 Bad habits to kick : abusing triggers
2009-10-12 Bad habits to kick : choosing the wrong data type
2009-10-11 Bad habits to kick : making assumptions about IDENTITY
2009-10-11 Bad habits to kick : inconsistent naming conventions
2009-10-11 Bad habits to kick : avoiding the schema prefix
2009-10-10 Bad habits to kick : using SELECT * / omitting the column list
2009-10-09 Bad habits to kick : declaring VARCHAR without (length)
2009-10-09 Bad habits to kick : using SELECT or RETURN instead of OUTPUT
2009-10-09 Bad habits to kick : using dashes and spaces in entity names
2009-10-08 Bad habits to kick : using old-style JOINs
2009-10-08 Bad habits to kick : using table aliases like (a, b, c) or (t1, t2, t3)
2009-10-07 Bad habits to kick : using loops to populate large tables
2009-10-06 Bad habits to kick : ORDER BY ordinal
2009-09-03 Ladies and gentlemen, start your semi-colons!
2008-10-30 My stored procedure "best practices" checklist
2008-03-20 Which to use: "<>" or "!="?

Comments ( 6 )

    • Alejandro Palacios says:

      Exellent post, thanks!

    • Teach Yourself SQL Server Performance Tuning (Dear SQL DBA Episode 12) - by Kendra Little says:

      […] Bertrand's Bad Habits Revival on the SQL Sentry blog (not all cause performance issues, you'll learn as you work through the […]

    • Bob St. Aubyn says:

      Hi Aaron – Greetings from Ohio! I actually met you at PASS 2013 in Charlotte (we had a spirited conversation about MERGE & deadlocks at some bar gathering). Fun stuff.

      Just wanted to let you know I just found this post and had to make sure you got proper kudos. Thanks a million for compiling this list! I think this should be a required stop for all frustrated DBA's (like myself) trying in vain to warn Developers of the evils of bad coding practices and preach the advantages of doing it right. This list will serve as a great starting template for putting together my own "best practices" documentation – something I've been meaning to do for a long time now.

      One of my frustrations with TSQL in general is that it's so easy to acquire bad habits. It's a snap to develop a poorly architected database full of bad code – one that works perfectly when first deployed – with no warning of the inevitable nightmare waiting for the unsuspecting DBA who has to support it as it accumulates data and user base down the road. Worse, managed code developers seem to be trained in such a way that encourages many of the bad practices you mention above, and discourages a mindset for writing efficient, set-based queries with reused, seek-heavy plans.

      The unfortunate fact of life at my current company is that the Dev's write most of the TSQL, and since coming on board I've been trying to turn the tide on rampant proliferation of what I call my "top 4": 1. Ad-hoc (TSQL queries embedded directly in application code). 2. Dynamic TSQL (SQL statement constructed on the fly). 3. Cursors (dev tendency to "loop" on everything). 4. DML triggers (the solution to every scenario where something has to happen as a result of data change).

      Here are a few others that are almost as prolific as these, that may even serve as inspiration to add to the above list:
      * – Wide, multi-column clustered keys (esp. on 100-million row+ tables; rapid fragmentation, page splits, potential huge cost in lost efficiency of all NC indexes and I/O's to maintain the NC's)
      * – Clustered key on a non-sequential value (self-explanatory, pretty much same as above)
      * – NC index created to optimize ONE QUERY (one of my favorites – this is how we end up with 8-10 NC indexes on 300-million row tables where several are largely redundant. Esp. in big databases, you have to tune for query *patterns*.)
      * – Refusal to create stored procedures – "My query is too simple, there's no performance gain!" or "I want to be able to see the query in my application code" (Nobody seems to buy the security advantages of sp's or the downside of filling the plan cache with single-use plans).
      * – Blocking or deadlocks? – just use NOLOCK! (you covered that above)
      * – Blocking or deadlocks? – we just need SNAPSHOT ISOLATION! (not a silver bullet to solve contention problems caused by bad coding and access habits)
      * – Explicit transactions declared on huge, long-running DML operations.
      * – Avoid cross-server queries! When you have to use them, THINK about the data and the options available to make them leaner.
      * – Sledge hammer approach to granting permissions ("I need to be able to create stored procedures, so I need to be in the db_owner role" or "I need to be able to deploy code on that instance, so I need sysadmin")
      * – In general, deploying TSQL to a 24/7 business-critical production database, with no idea or concern for what its execution plan looks like. I've often said that you can write a complex query 5 different ways and all will produce the correct result – but 1 will be most efficient. When you're talking about highly contentious resources in a database like that, shaving milliseconds IS WORTH IT!

      Jeesh, I could go on and on… Sorry for the long winded post. I just got a little charged when I found this page because it's so relevant to what's going on in my world right now. Keep up the good work, Aaron. Hope to see you again soon!

      -Bob

    • Kin says:

      Is it possible to share above valuable tips in a PDF format ? That would be awesome.

      Thanks,
      Kin

    • Aaron Bertrand says:

      Hi Kin, it's an interesting thought, however I don't own or even control the content at all of those destinations, so I'd have to be very careful about which articles made the eBook.

    Leave A Comment

    Your email address will not be published.