How important is the order of columns in indexes?

Abe Miessler picture Abe Miessler · Feb 18, 2010 · Viewed 89.6k times · Source

I've heard that you should put columns that will be the most selective at the beginning of the index declaration. Example:

CREATE NONCLUSTERED INDEX MyINDX on Table1
(
   MostSelective,
   SecondMost,
   Least
)

First off, is what I'm saying correct? If so, am i likely to see large differences in performance by rearranging the order of the columns in my index or is it more of a "nice to do" practice?

The reason I'm asking is because after putting a query through the DTA it recommended that I create an index that had almost all of the same columns in it as an existing index, just in a different order. I was considering just adding the missing columns to the existing index and calling it good. Thoughts?

Answer

Nick Craver picture Nick Craver · Feb 18, 2010

Look at an index like this:

Cols
  1   2   3
-------------
|   | 1 |   |
| A |---|   |
|   | 2 |   |
|---|---|   |
|   |   |   |
|   | 1 | 9 |
| B |   |   |
|   |---|   |
|   | 2 |   |
|   |---|   |
|   | 3 |   |
|---|---|   |

See how restricting on A first, as your first column eliminates more results than restricting on your second column first? It's easier if you picture how the index must be traversed across, column 1, then column 2, etc...you see that lopping off most of the results in the fist pass makes the 2nd step that much faster.

Another case, if you queried on column 3, the optimizer wouldn't even use the index, because it's not helpful at all in narrowing down the result sets. Anytime you're in a query, narrowing down the number of results to deal with before the next step means better performance.

Since the index is also stored this way, there's no backtracking across the index to find the first column when you're querying on it.

In short: No, it's not for show, there are real performance benefits.