Subjectivity : Naming Standards - SentryOne Team Blog

Subjectivity : Naming Standards

@BrentO

Reading: How I Write SQL, Part 1: Naming Conventions
http://launchbylunch.com/…/sql-naming-conventions/
Not a bad set of defaults.

Last week, I came across this post on database naming conventions by Sehrope Sarkuni. It was an interesting read, and there are definitely many items in his list that I agree with 100%.

Naming conventions are a very subjective thing, of course. Just like coding styles, they can often lead to anything from minor debates to full-blown religious arguments. Upper case vs. lower case keywords, tabs vs. spaces, and where to put commas come to mind. Heck, these same people even get into sub-arguments about whether the terms "upper case" and "lower case" should use hyphens, spaces, or neither. I've long been a proponent of not caring about which naming standards you use, but I do find it very important that your standards follow these three basic rules:

  1. The conventions make sense. You should be able to argue why the chosen convention is better than an alternative, and it can't just be because you like it better. This doesn't mean you have to win that argument, just that you should be arguing for something tangible.
     
  2. The entire team is on board. You should all agree on a standard before implementation, and any changes you make over time should be by committee.
     
  3. You don't make exceptions. Even if you're a one-person team, if you're going to bother having a standard, it needs to be used consistently. It's amazing how quickly exceptions can become the new rules.

That said, that doesn't mean I don't have opinions. :-) There were a few things in Sehrope's article that make a little less sense to me, and I'm prepared to argue why – particularly where the scope is limited to SQL Server. Of course, if your conventions need to cover multiple platforms and not just SQL Server, then some of my arguments may carry less weight.

TL ; DR VERSION

Since I know some of you like to minimize reading and scrolling, I'll sum up my counter-points to the post:

  • In SQL Server, identifiers do not need to be all lower case to avoid quoted identifiers, and hence snake case is unnecessary, too. Snake case also introduces issues for some of Sehrope's other conventions.
  • A table is a set of things, and therefore should be named plural rather than singular – even if, sometimes, a table might only have one row.
  • Single-column keys should not be named id – if it's a CustomerID, it should be called CustomerID always, not most of the time.
  • It's not practical or necessary to always include both table names, all column names, and the type of constraint in a foreign key constraint name.
  • It's not practical or necessary to always include the table name and the type of constraint in a check constraint name.
  • Some of Sehrope's points conflict with each other, such as the use of namespace prefixes.

Read on for more details. I would have commented on the page directly, but it's an older post, I'm not sure if comments are moderated properly, and I am certain I would exceed the max length of a comment anyway.

"Identifiers should use lower and snake case"

Sehrope argues that identifiers should always be in all lower case, and visual separation should be achieved through underscores ("snake case"). The reason for lower case is that, on some platforms, mixed or upper case names require quoted identifiers. This is not true in SQL Server, so forcing lower case can actually hamper readability. Subjective, sure, but I don't agree that a junction table like book_authors is more readable, or easier to type, than BookAuthors. The underscores add some space between the words, which can aid readability for some, but it can work against you when reading in a fixed-width font (it can look like two separate words) or when the identifier is in documentation and underlined (the underscore disappears).

Mixed case does require some care when coding, though. On case insensitive systems, mistakes are easy to miss, especially if names aren't provided through tools like IntelliSense (which takes the case directly from the metadata). And there will be serious problems if those mistakes make it through to a deployment onto a case sensitive system.

I do have one exception where I like snake case: procedure names. Unlike entities, most stored procedures have two very different parts: an object and a verb. My preference is object_verb, e.g. BookAuthors_Update, so that I can immediately identify the entity the procedure(s) act on, and can then move on to the action they take. This pays off best in sorted lists, such as Object Explorer, where I am often scanning for the stored procedures that involve BookAuthors. Some people like naming conventions like verb_object, e.g. Update_BookAuthors, but then that makes all the stored procedures that perform updates on any entity sort together, which is rarely useful.

"Table Names Should be Singular"

By definition, a table is a set of things, even if in some cases that set only has one member. And in all of the apps I've worked with since the 1990s, that scenario is pretty rare, never mind permanent. It's unlikely there are many convincing use cases where a table is meant to only have a single row. So, in my opinion, as long as you can avoid getting hung up on how to pluralize an entity name, table names should be plural. This is because when you speak about a set of things naturally, you use the plural form. When you come home with a bag full of stuff from the grocery store, it is a bag of groceries, not a bag of grocery.

Sehrope debates whether a table – where each row represents a person – should be called Persons or People. I would probably go with the irregular plural People, or something more specific, like Members, Users, or Customers. If the goal was to avoid plurals that end with s, then you could go with Membership, Staff, or Userbase. Celko would likely opt for Personnel, and why not? It's what ISO once used as an example. From a former version of ISO 11179:

Tables are Collections of Entities, and follow Collection naming guidelines. Ideally, a collective name is used: eg., Personnel. Plural is also correct: Employees. Incorrect names include: Employee, tblEmployee, and EmployeeTable.

"Single-Column Keys Should Be Called id"

This is probably the convention I feel strongest about. Contrary to Sehrope's advice, I think that, when possible, an entity should be named the same thing throughout the schema. Yes, this means that you're using a prefix as a sort of namespace, but this is happening in every other related table anyway. This goes back to my bullets about consistency and exceptions. No matter how convoluted your joins and aliases are, you can always tell which id is being referenced in a query because it will be named explicitly.

Having a full name in the primary table (say, Customers.CustomerID instead of Customers.id) also makes it much easier to find all the related instances. If you have the quandary above about People vs. Persons and the 15 other ways such a table might be named, and you just named the PK id, what column name are you looking for in the rest of the metadata? If using id means you don't have to worry about what you would actually call that name if you wanted a full reference (like PeopleID or PersonID), this doesn't really buy you anything except that it allows you to slightly defer that decision until you create the first related table, because that column can't simply be called id.

"Foreign Key Names Should Include All Column Names"

Sehrope suggests that the name of a foreign key constraint should be self-documenting, by including all of the column names in the constraint name. The example he used was where the team_member table referenced the team table, using the id column, and the foreign key name he preferred was team_member_team_id_fkey. On quick glance you might see a problem: This makes snake-cased object names a much harder sell, because now the underscores serve multiple purposes – it can be hard to distinguish when they are being used to separate tables from each other, are just natural parts of table names or, in the case of a multi-column key, to separate columns from each other, or are just natural parts of column names. (This is more about strengthening the argument against snake case in entity names, not arguing against snake case in constraint names.)

Fictitiously, let's say we love snake case, and have tables foo_team and bar_group, with a multi-column key on (foo_id, bar_group_team_name). Under the proposed rule, a foreign key constraint relating the two tables might then be named bar_group_foo_team_foo_id_bar_group_team_name_fkey. While I agree that single-column keys are the majority by a great margin, the potential for this kind of mess is just one of those exceptions I can't get behind. Further to not being able to easily parse and understand that name, you can also get into issues if you have particularly long table and column names – identifier rules in SQL Server dictate that you're capped at 128 characters.

I also find the _fkey suffix unnecessary (and violates a major section title in Sehrope's article: "Prefixes and Suffixes (are bad)"). If you're researching the schema, you get foreign keys from very specific metadata views, so there is no confusion about what type of constraint you are looking at. And when you are troubleshooting an error message in real time:

Msg 547, Level 16, State 0
The INSERT statement conflicted with the FOREIGN KEY constraint "bar_group_foo_team_foo_id_bar_group_team_name_fkey". The conflict occurred in database "demo", table "dbo.foo_team".
The statement has been terminated.

There's nothing useful in that error message that you couldn't also have determined if the constraint were simply named bar_group_references_foo_team.

"Index Names Should Include all Column Names"

Similar to foreign keys, the author argues that indexes should be self-documenting by including the names of all of the columns present in the index. But also as with foreign keys, this can become a real tough thing to pull off, given the 128-character limitation (and the issues with snake case). For specific, narrow indexes you probably won't run into any problems, but in many scenarios (especially reporting systems), this will be much more difficult.

In addition, to make an index name fully self-documenting, we're not usually just talking about a generic list of columns. There are many other attributes of an index that I would have to go look up anyway, and often these things are more important than just the list of columns. A quick rundown off the cuff:

  • Is a column is in the key or in the include list?
  • Is each column in the key ascending or descending?
  • Is the index clustered or non-clustered?
  • Is the index unique? (Related: Is it a unique constraint implemented as an index?)
  • Does the index have any filter predicates?
  • Does the index contain any LOB columns?
  • Are the statistics for the index up to date?
  • Is it a special type of index, such as XML, Geography/Geometry, or ColumnStore?
  • Is the index on the primary filegroup?
  • Is the index partition-aligned?
  • For in-memory indexes, is it a hash index, and if so, what is the bucket size?

I'm not suggesting that I would want all of those things represented in the index name, because then we'd need a heck of a lot more than 128 characters. I'm just demonstrating that if I'm doing any kind of performance tuning or index analysis, having slightly faster access to the column names doesn't prevent me from having to go and do a bunch of research about the index anyway. This would probably work fine in the simple, narrow index case (in which case maybe it's okay for the index name to indicate the leading key column), but breaks down quickly when any of these other factors come into play.

Like with foreign keys, I find the ix identifier in the name unnecessary. I'm trying to envision a scenario when I'm looking at an entity name and I don't already know it's an index – certainly not when I'm looking at an execution plan or analyzing the indexes on a table.

Inconsistencies

This may seem a bit nit-picky, but the author talks about using id instead of, say, PersonID, in order to avoid using a prefix as a sort of namespace; he also suggests to avoid using all but the most common abbreviations. Then the check constraint he shows toward the end of the post is named person_ck_email_lower_case. Why ck instead of check? I'm not sure how common ck is as an abbreviation, nor am I sure that either ck or check is necessary (for the same reasons fkey is not necessary in a foreign key constraint name, and ix is not necessary in an index name). If a constraint violation occurs, it's going to be pretty obvious what type of violation happened, and it also makes the table reference redundant because the error message will state which table had the violation. Here is an example of a check constraint violation:

Msg 547, Level 16, State 0
The INSERT statement conflicted with the CHECK constraint "person_ck_email_lower_case". The conflict occurred in database "demo", table "dbo.person", column 'email'.
The statement has been terminated.

So, having the table name in the constraint name is redundant, and doesn't add any value during troubleshooting. He repeats this namespace prefix in index names (e.g. person_ix_first_name_last_name) – here, too, the leading table name is redundant; I can't picture a scenario where I'll be looking at an execution plan and not know which table an index belongs to (these things are paired together intentionally in both the XML and the graphical plans). Here's an example using SQL Sentry Plan Explorer:

Redundancy in XML and graphical plan Redundancy in XML and graphical plan

Unless I am being intentionally obtuse, there's no way for me to not know which table belongs to that index.

CONCLUSION

I don't expect you to agree with all of my points here; in fact I'd be kind of disappointed if you did. Also, I hope this isn't interpreted as a slam on Sehrope's post, which I really think is a valuable read no matter what platform you use. This can be a touchy subject with no right answers, and with a lot of opinion and bias inevitably shaping decisions. I just wanted to illustrate the thinking behind some of the naming standards choices I make.

Comments ( 46 )

                            • Solomon Rutzky says:

                              Yes, PascalCase is much preferred (not camelCase, which starts with a lower-case letter). And I completely agree with separating "object" and "verb" portions of Stored Procedure / Function names with an underscore, but would characterize that as separating two PascalCase terms — kinda like a namespace — rather than being "snake" case (meaning: the rule for no "snake" stands without any exceptions ;-)

                              Agreed that there should never be an "id" or "ID" field in a table as it is meaningless (not in terms of the object, but in terms of using it in the larger context of queries with multiple tables) and confusing. To address @Thomas's point: aliases should be used in any case. But allowing any key fields (whether single or composite) to have simple names that will show up in other tables violates one of my rules: A field name must not change if it is duplicated in another table to be a Foreign Key. If your PK field is always "ID", then you have to change it to "CustomerID" in your "Orders" table, etc. That just makes the data model harder to work with. CustomerID is CustomerID no matter what table it is in. The only time the name changes slightly is self-referencing FKs, or if for some reason the FK field shows up multiple times in the related table (User.UserID in Orders could be, for example, CreatedUserID and ModifiedUserID).

                              And now the fun part: Singular vs Plurals ;-)
                              Plural names "sound" more natural as a table is a set of something, but I, at least nowadays, side on the Singular side of table naming, but for a more pragmatic reason that I do not believe has been mentioned, and relates to the previous point of using "CustomerID" instead of just "ID": predictability so that things can be done programmatically. I have had to write too many scripts that do data and/or schema changes on some/most/all tables (without being able to query the DB for schema info) and it is very easy to script a pattern like {TableName}ID, whereas different pluralization rules ("s", "es", "ies", etc) make such scripting nearly impossible. True, such a need does not come along terribly frequently, but when it does, having a standard of "if there is an IDENTITY field, it will be {TableName}ID" really helps. And "CustomersID" just doesn't work. But again, it still feels odd / sounds wrong to name a table using the singular form, but the benefit has been well worth that minor discomfort.

                            • Aaron Bertrand says:

                              Your reason for singular doesn't really fly – if you're writing scripts against multiple tables, you can always join against sys.identity_columns or a more elaborate query against sys.indexes/sys.index_columns. This is far better than assuming that the table that holds the IDENTITY column called CustomerID must be named Customer. I've seen systems with tables like Personnel having a key column EmployeeID, for example. Since most of us spend a lot more time working with tables and queries directly than we do on scripting and automation that rely on assumptions about naming conventions, I still contend that plural is better.

                            • Solomon Rutzky says:

                              As I said, "without being able to query the DB for schema info". Also, the idea was not to work it backwards from the ID field's name to get the table name, but to know the ID field's name based on knowing the table name. And we both have seen systems with lots of whacky things being done, so a `Personnel` table with a key column named `EmployeeID` is just another example of something that should probably not have been done. We are talking here about what each of us prefers to do when we get to do what we prefer. And I did admit that it was not very often that such scripting would be done. But, even in the day to day working with something like 1000 – 2000 tables, it was even easier on writing queries to just know that `TableName`, if it had an `IDENTITY` field, would have `TableNameID`. It saved steps (more so than typing) via consistency. Again, this was purely pragmatic, and if not for that reason, I would completely agree with you on using the plural form (so much that when I am creating new tables I still sometimes use the plural form when first typing it out).

                              And please keep in mind, I am not trying to convince you to change your mind. I am just sharing the reason why I chose this path. This reasoning might resonate with some, and it might be absolute poppycock to others (whatever that is ;-). All I'm saying is that both ways have their pros and cons, and I've done both, and from those experiences, I now pick the one where, for me at least, the pros outweigh the cons.

                            • Joe says:

                              Sorry if this has already been debated. I didn't read all the previous comments.

                              "… it can't just be because you like it better"

                              Actually, it really is just that, because the criteria for evaluating the conventions are all subjective.

                              Some people find, for example, that underscores add readability, and I'm sure they're right. For them. Personally, I think the underscore key should be systematically removed from the keyboard of every programmer on the planet. I despise them with unbridled passion because they're such a horrible pace-killer to type. I can type prose at about 80 WPM. An underscore for me is like a stop sign in the middle of a stretch of freeway. Others who aren't good touch typists couldn't care less about that.

                              I've been watching styling convention debates for, literally, decades, and have only seen one that made any sense at all. We just went through a laundry list of all the points and the team voted them up or down with minimal discussion. We agreed on most things anyway because we'd mostly been trained the same way. For the rest, we all won some and lost some. The only thing I ever remember discussing where minds were changed was on the use of Hungarian for variable data types. Most people picked up Hungarian because MS used it in examples. So did I. Then I started writing without it (I kept scope modifiers) and loved it. In the discussion, I pointed out several examples where people, most of whom were there, had modified a variable data type and not changed the Hungarian tag. We agreed to try coding without it on a trial basis. Not one person ever suggested going back. Win for me. Others I lost, most notably the vote in favor of camel case, which I hate almost as much as the underscore.

                              The point of all that was that we voted as a team, and most people got mostly what they wanted. In the cases where there was disagreement, the majority opinion won the day. As it should.

                            • Aaron Bertrand says:

                              What I meant by that is if you get into discussions about which convention to use, and you disagree, you should be prepared to offer more than just "I like camel case better." I suppose if you have an odd number of people voting, a simple vote works, but if the number of people involved is even, you could still end up in a tie. Then what?

                            • Thomas says:

                              – I agree with Sehrope that you should not include the table name in column names such as with "CustomerId". Having just the column name is never enough. Yes, it does require developers alias their tables well. Having "A.Id" may definitely cause confusion. It also enforces that this value doesn't mean anything; it's purely a surrogate value. Also, what happens when you have two tables with the same name in different schemas?
                              – Having all columns in the FK name is absurd. We name our FKs with: "FK_current table schema_current table name_referenced table schema_referenced table name". If there are two references to the same parent table, then we'll include the name of the referencing column. Any more than that is too wordy. Admittedly, the "FK" is probably superfluous.
                              – The one advantage of using "IX_" is to separate it from "UC_" which is what we use on unique constraints. Including all columns (as well as 'include' columns?) would be way too wordy.
                              – No one talks about *why* people make things all upper or lower case. The main reason was monochrome editors. However, those days are long, long gone. Everyone has color coding now and readability is more important IMO. Words all in the same case are harder to read IMO than using PascalCase. Even 'snake case' is harder to read than PascalCase IMO.
                              – I too was taught that tables are sets of things and should be plural but must admit to having trouble at times with making things plural. Still, I think it is beneficial for other developers to understand they are dealing with a set.
                              – I really hate abbreviations unless they are *universally* known. E.g. VAT or SSN. Using "StateCd" is unnecessarily obtuse and terse when "StateCode" is much clearer. We came across this recently with "GL". That "L" next to the G is hard to read and "GeneralLedger" is clearer (but now has two "L"s next to each other).

                            • Dave Bennett says:

                              I have to say I land firmly on the singular table name side of the fence, for what I think is a very simple point.
                              What additional value is gained by using a plural?
                              You have stated, tables nearly always contain multiple rows. This is understood, so while I agree that superfluous abbreviation is a bad idea, why add additional characters to a table name that have no value. What is the value of adding an "s" or other plural suffix to an object that we already know will have multiple members. You may think that trivial, but it really does add up when you think of all the times you will be typing/using those superfluous characters in everything (constraints, indexes, procedures, select statements, etc.) that then uses those table names.
                              Bottom line, I see no value add in the plural, but a clear cost if it is used.

                            • Aaron Bertrand says:

                              @Dave, I don't agree with you that typing an extra character has a "clear cost"… do you also argue for leaving out vowels because Dprtmnt is faster to type than Department? Do you also argue for typing DATEDIFF(D, ... because it takes so much longer to type DATEDIFF(DAY, ...? Sometimes there are good reasons to be explicit, self-documenting, and properly descriptive – even if typing an extra character or two might convince you that it makes you less productive.

                            • Dave Bennett says:

                              @Aaron,
                              I you didn't address my main point: "What value is gained from using plurals?"
                              I don't see the value, and I do see the cost.
                              Eliminating vowels (or any other type of abbreviation) introduces ambiguity where none existed with full names. The same cannot be said for plurals.

                            • Aaron Bertrand says:

                              @Dave, the gain for me is that I can now talk about the set of customers naturally. It is a collection of Customers, not a collection of Customer. I think about set-based queries by extension of the conversation and documentation. I look at a table called Customers and I naturally think about operations that I'm going to perform against that table and I lean toward set-based queries like "an update of all of the customers meeting such-and-such criteria." With a singular table name, I tend to think about the individual operation I need to perform on each customer. While this conceptual mindset might not matter in 4GL, it can be a devastating difference in relational queries.

                              Again, I am in line with the way ISO described it – it is a collection of entities, and naming it using a plural is therefore correct. I don't need any additional benefits to justify the cost of typing one extra character to accurately describe it as such. And I'm not trying to convince you to change your mind – if you don't see the value of deeming a table a set, keep using singular names. But personally – and I will admit I do get hung up on semantics more so than character count – I find value in naming it accurately.

                            • Alex Friedman says:

                              Great points!
                              A good reason to use prefixes for index naming is for finding their usage in hints — it makes searching the metadata a piece of cake instead of a nightmare.

                            • Aaron Bertrand says:

                              @Alex, I could be convinced, though thankfully in most of the environments I deal with, index hints are extremely rare or simply non-existent. They're dangerous for reasons far more important than ease of searching the plan cache.

                            • Steve Jones says:

                              I like the post, and agree with your comments for the most part. The one place I'll disagree is for indexing. You wrote in a comment that you'd do firstname_lastname for the index. While I agree that I don't need to know the table name, there could be collisions among tables. I might want a firstname_lastname_nonclustered_unique index on two tables. In that case, I would have a collision.

                              My preference for indexes is to have the parent object included, just to prevent collisions. I have been in the habit of CI or NCI in there as clustered or nonclustered, but with tooling today, there's no reason to avoid using long names.

                            • Aaron Bertrand says:

                              Okay, but just to be clear, you won't have collisions – indexes are not one of the objects that need to be unique per database, just unique per object. So you can do this no problem:

                              USE tempdb;
                              GO
                              CREATE TABLE dbo.A(id INT, INDEX x(id));
                              CREATE TABLE dbo.B(id INT, INDEX x(id));
                              CREATE TABLE dbo.C(id INT, INDEX x(id));
                              CREATE TABLE dbo.D(id INT);
                              CREATE CLUSTERED INDEX x ON dbo.D(id);
                              DROP TABLE dbo.A, dbo.B, dbo.C, dbo.D;

                              I don't have a problem with you wanting to use additional qualifiers on the index – use which bits make sense and are helpful in troubleshooting and tuning. But the length of names are still limited, and you can't include everything – the table name is redundant to me, because as I said, I'm never looking at an index and wondering which table it came from.

                            • Scott says:

                              I haven't run into that problem yet with 128 characters. In some of my reporting databases I may have several indexes that the optimizer can choose from. To keep I also have a set of jobs that regularly look for unused indexes to keep them under control.

                            • Aaron Bertrand says:

                              @Scott Right, I don't suspect many have run into that issue, my point was simply that you can't stuff an index name with *all* of the information that would be useful about that index – even just listing all of the column names as Sehrope suggests might not be possible.

                            • Scott says:

                              Ahh, the great naming convention debate. I have followed Celko's advice for years, especially since for the most part it agreed with my own thoughts. With auto complete functionality built in to our tools, why not use descriptive names that have real meaning. I have created scripts that manage all of my primary key, defaults, check constraints, indexes and foreign key names. They tend to be verbose since I also use schema naming to group table names into functional groups. Using the scripts remove most of the burden from naming objects.

                            • Drew says:

                              I agree with most of the suggestions made in the article. What is most important is to have a naming convention for all the database objects and to adhere to the naming convention for all development. There is no "perfect" or absolute naming convention that must/should be followed as each business has unique application requirements and a "one size fits all" method generally will fail over time. My choice of database object naming convention is generally dictated by what works best for the support team and myself when something breaks at 3 AM and a "bleary eyed" support individual has to figure out the root cause as quickly as possible and prepare a fix.

                            • Yves says:

                              I disagree with always using plural form when naming a table. It is assumed that a table will contain multiple rows, otherwise it wouldn't be called a table. So singular form is used by default unless a row represents a collection of objects, which is rare (s rare that I can't remember any such examples right now). When a row reprsents multiple objects I use plural form.

                            • Aaron Bertrand says:

                              Yves, this doesn't really explain why a table with multiple rows should be named singular. When you go to the grocery store, do you come home with bags of groceries or bags of grocery? Also, it's funny that you bring up the point you make about assuming a table contains multiple rows, because one of the most common arguments I hear for naming it singular is "because what about the case when the table only has one row!?!?"

                            • Dan says:

                              Great post on a very interesting topic. I have always struggled with naming conventions, in particular the dilemma if table names should be singular or plural. Over the years, I have seen examples (like a table named ProcessStatuses), which led me to stay with singular. It is easier to come out with singular names, than plural ones. In addition, I always found singular table names to provide more clarity and simplicity.

                            • Aaron Bertrand says:

                              See, and I would go with ProcessStates.

                            • Dan says:

                              Agree… Definitely better than ProcessStatuses! Great Post!

                            • Adolfo Socorro says:

                              Aaron, thanks for sharing your views. From my data modeling formation, I like singular names for tables, but also because of the irregularity of plurals, which becomes a more serious issue in diverse teams in which not everybody has English as the first language.

                              I also like prefixes, like "tb" and "vw", for example. One reason is to avoid conflicts with reserved words (e.g., Order in a sales database or Event in a planning database). Another is to easily identify whether a query is using a table or a view, rather than trying to find, for example, "Customers" in the list of tables and in the list of views. I get that this is inconvenient if for some reason a table is converted to a view or vice versa, but in my experience this is not very common.

                            • Aaron Bertrand says:

                              Well, I think you'd avoid the necessity for the prefix in those rare cases if you'd named the table Orders instead of Order in the first place. :-)

                            • Sean Redmond says:

                              I like Brent's summary: be consistent, be in agreement and make sense.

                              We use a convention that I find very useful. All tables have a unique abbreviation (so People becomes People_PE). This abbreviation is used for primary keys, foreign keys, indexes, naming etc. All non-composite primary key columns take the form PK_ followed by the tabular abbreviation. Likewise, all foreign keys take the form FK_ + tabular abbreviation.

                            • Andrew says:

                              This one comment rings true from my years experiences:
                              " You should all agree on a standard before implementation, and any changes you make over time should be by committee. "

                              I have worked in organizations that followed a coding naming convention (not necessarily the standards you proposed above) and for organizations where no-one saw the need for any such standard, as these were considered a hindrance to productivity.

                              My personal experience has been that those organizations that followed a coding naming convention (and if they have a naming convention, they generally also had a coding review standard in place) displayed greater productivity, even on a daily basis, which in turn made me 'feel' (because we are humans, and our gut feelings are important to us) that I had a much more rewarding work life.

                              The organizations that did not follow any sort of coding naming convention, often also had little to no coding review standards. This lead to a continual "reinvention of the wheel' and often left individuals working as if they were in a vacuum. No one does their best work in a Vacuum. Needless to say I did not go home at the end of my days feeling that I had contributed towards something bigger than myself, but rather that I was simply a mouse in a cage spinning a wheel… and going nowhere.

                              But these comments are just my personal opinion, formed from my personal experiences.

                            • Steve Mangiameli says:

                              Naming conventions are cool…in a new shop. But what do you do in a shop supporting multiple business lines, multiple teams of devs, that's been around for a long time and never, ever set a naming standard? Or better yet you inherit a new business's code through a merger? Is it worth the time or the effort or the risk to standardize something like that?

                              These are rhetorical and as subjective and situational as the rest of the post. I agree with naming standards, but I have yet to be in a shop where there is anything posted, let alone enforced during code review. Mostly, I agree with them as far as I can control them so that when it comes time to support the stuff I've written or designed, I don't have to think about it as much and hopefully, it is easy enough to follow for those coming in behind me.

                            • Aaron Bertrand says:

                              Well, you said it, it's subjective and up to you. Is there enough confusion in the current chaos to make it worth introducing risk changing names? I would say not very likely, but I don't know.

                              That said, even at old companies, there are new projects. Establishing a naming standard for projects going forward is better, IMHO, than just letting chaos continue to reign. YMMV.

                            • Justin says:

                              90% plus in agreement with your perspectives. I've actually written several naming convention documents in my time and the most difficult part is logical consistency.

                              Regarding table names, I am forever bound to using singular names from my days as a data modeler. A (logical) entity is a set of attributes describing one thing, not a set of things. The 5, 15, 40 attributes of the Customer entity describe one Customer. I carry that convention into the database design where each row in a table is made up of columns that describe a single Customer, whether we have 1 or 10 million customers.

                              Besides, the vagaries of English pluralization rules irritate my visual sensibilities.

                              That being said, I'm not about to go to war over it, as long as either convention is followed consistently.

                            • Aaron Bertrand says:

                              But you're still thinking about exactly one row as an object in and of itself. Think about it a different way (but still as a data modeler, because that's how I think in the same context): the table contains a set of customers, whether we maintain 1 or 400 attributes about each of those customers. I am not talking about the attributes as the "set" but rather the customers.

                              We get around this in a lot of cases using common sense – a calendar table, for example, is called dbo.Calendar, not dbo.Dates or, less likely, dbo.Date. A table of numbers is called dbo.Numbers, not dbo.Number. A history of log data is called dbo.LogHistory, not dbo.HistoricalLogRecords or dbo.HistoricalLogRecord.

                            • Madhivanan says:

                              Aaron, you specified that you would like to use camel case naming for tables and happy to use lower and snake case for procedures. However I would say that even for tables, it is more readable to use lower snake case for lengthy object names. ex CustomerMembershipTypes Vs customer_membership_types. However it always depends on one's own preference.

                            • Aaron Bertrand says:

                              Yes, it is all about preference (hence the title). Just to clarify, I am happy to use snake case for stored procedure names, but these are used to separate entire concepts (entity vs. action), not just to separate words. So I would use OrderDetails_Update, not Order_Details_Update. And at no point did I advocate all lowercase in any of these scenarios.

                            • Shubert says:

                              I use .Net naming conventions. Table names in SQL are odd in that they are both the set of items and also the item itself. So whether you use a plural or singular name it will look strange somewhere. I solved this (for my eye, anyway) by using singular + 'Table'. So "select MemberTable.FirstName from MemberTable where MemberTable.LastName = 'Smith'"

                              Likewise I suffix views with 'View'. But I know SQL people will never accept this heresy…

                            • Aaron Bertrand says:

                              Yeah that will have a hard time flying in most shops, primarily because of how much code would have to change if you ever switched, say, a table to a view or table-valued function.

                            • cb says:

                              I write a fair amount of C# and SQL, my 2 cents this would be similar to adding Object to the end of your class names. Plural names for tables are in my view right, but years of singular names makes it hurt.

                            • Paul Hunter says:

                              Pretty much 90%+ in agreement.
                              I convey index info in a prefix (ix, cx, uk). I use Child_has_Parent for foreign keys. I use suffix of _vw for views. Definitely 100% on board with Object_Verb for procedures. I can go either way on plural/single table names. I like Id for identity columns and use Cd (code) for manually maintained idenfifiers. For example StateCd for a list of states, CountryCd for countries. It also works for enumerated lookup table entries. I've considered using Sq for sequence values.

                            • Aaron Bertrand says:

                              Sure, it's all quite subjective, but I abhor abbreviations like Cd for Code. Is saving two characters of typing really worth it? I have a different post planned for these perceived productivity boosters that really aren't…

                            • CraigS says:

                              Particularly true if the abbreviation is short for something else common. Is StateCd "State Code" or "State CD"? OK if you think about it, but reading can be confusing.

                            • Kerneels Roos says:

                              No, totally agree!
                              Abbreviations should only be allowed when it is a very well known, generally accepted thing, like SSN, or when the full phrase would simply be far too long, but even then only when it is a widely accepted business term.

                              I'd go so far as to say, avoid abbreviations at all cost since you *will* forget what it stands for eventually, and it is a silly barriar of entry to others.

                              I've really gained a lot from a convention about date fields: always have it in the form of 'DateOfXyz' and never 'XyzDate'. That way you never wonder which way round it is.

                              The plural table naming bugged me long time ago, when I thought it cool to write SQL that looked like some ORM query language (Panda.Weight, Pool.Ph), but plural is far more sensible imo.

                            • Ivan says:

                              I truly believed that table names should be in singular but even if the ANSI SQL says otherwise, my personal preference will be still singular. The English language has too way many exceptions for the "pluralism" :). For example: woman -> womans…oh no sorry, women. activity -> activitys…can't do that either: activities.

                              By the way, how do you name indexes after all?

                              Thanks for the great post.

                            • Aaron Bertrand says:

                              @Ivan that's fine, like I said, this is all subjective. I don't feel like English barriers should dictate whether a table looks like it contains a set, but to each their own. :-) I mean, you're going to have to talk about women or activities when you're discussing those sets with other people on the team, so…

                              As for indexes, I will typically name them with the key column(s), no IX prefix, and if there's anything special about it (like it's the PK or it's unique or it's clustered), add that/those in the suffix. So FirstName_LastName or, more likely, LastName_FirstName. If that is the clustered index, then LastName_FirstName_clustered; if it's unique, LastName_FirstName_unique_clustered; if there's a filter (can't be clustered), LastName_FirstName_filtered or, even more helpful, LastName_FirstName_filtered_IsActive.

                            • toolsmythe says:

                              I concur Ivan.

                              If, as the OP asserts, "By definition, a table is a set of things", then what is being conveyed by making a table name plural? What, in this case, is pluralization but a form of suffixing, and isn't suffixing bad?

                              At the end of the day, a table name is a way for human beings to talk about a set of data (a computer would be just as happy if you used guids or binary numbers to name tables). They should be named in a manner that makes it easier to talk about. What is easer for you to say, "The user table …" or "The users table …"? Personally, I have to slow down to say the second one.

                              With singular table names, I also don't have to spend time pondering grammar when I should be focusing on normalization.

                              You all name things whatever you want; I'm sticking with singular table names.

                              JP

                            • Bodhi Densmore says:

                              I have problems with some singular table names having an eponymous column name. The “User” table has a “User” column. The CurrencyType table has a CurrencyType Column. This causes problems when searching query code for column or table references. I prefer plural table names.
                              If you use synonyms to transition from a singular to a plural table name, you must be careful to dereference the synonym when looking in information_schema and other catalog views.

                            • Lee Markum says:

                              We are starting the conversation of coding style and naming convention at my job where I am their first DBA and so this post is very timely. Unfortunately the business has been around for 10 plus years with a mix of people naming things in a variety of ways. I am working with two other developers to hash out a standard and I plan to use this article as a point of discussion.

                            Leave A Comment

                            Your email address will not be published.