Why isn't index likely to do much if it's DATETIME or DATETIME2 because they include the time portion?

Gennady Vanin Геннадий Ванин picture Gennady Vanin Геннадий Ванин · Nov 20, 2010 · Viewed 9.8k times · Source

Comments to question "How to decrease response time of a simple select query?" tell:

  • "What is the data type on LaunchDate? An index isn't likely to do much if it's DATETIME or DATETIME2 because they include the time portion – OMG Ponies"

  • "@OMG - Why wouldn't a Clustered Index on a DateTime column improve performance? The query is a range scan which would allow for a fast range index lookup as all data would be in sequential blocks? Semi-related...msdn.microsoft.com/en-us/library/ms177416.aspx – Calgary Coder"

  • "Calgary Coder: DATETIME/2 includes time -- an index, clustered or non-clustered, would be good for dates with duplicate times but not ranges. – OMG Ponies"

I created a test table with clustered index on DATETIME type column LaunchDate and observe index seeks for queries similar to cited in above question:

SELECT COUNT(primaryKeyColumn) 
FROM   MarketPlan 
WHERE  LaunchDate > @date

instead of table or index scans.

Why wouldn't a clustered index on a DateTime column improve performance?
Why isn't index likely to do much if it's DATETIME or DATETIME2 because they include the time portion?

I'd appreciate a script illustrating that indexing of DATETIME column does not improve performance.

Update: Also, Did OMG imply that index on DATE type column would be helpful but not DATETIME and DATETIME2?

Answer

gbn picture gbn · Nov 20, 2010

I've read the other question, no idea what OMG ponies means

3 points:

  • It shouldn't matter if an index is clustered or non-clustered:
  • It doesn't matter whether time is included too
  • It just has to be useful

Seek or scan:

Based on statistics, if LaunchDate > @date means, say, 90% of the rows, then most likely a scan will happen. If it is quite selective, then a seek is more likely.

Regardless of clustered or non-clustered!

What index?

A query like this would require an index on LaunchDate and primaryKeyColumn

SELECT COUNT(primaryKeyColumn) 
FROM   MarketPlan 
WHERE  LaunchDate > @date

Now, any non-clustered index refers to the clustered index which is assumed to the PK by default. So primaryKeyColumn is implicitly included already.

Superstition

However, COUNT(primaryKeyColumn) is a superstition. Because PKs do not allow NULL, it is equivalent to

SELECT COUNT(*) 
FROM   MarketPlan 
WHERE  LaunchDate > @date

SELECT COUNT(1) 
FROM   MarketPlan 
WHERE  LaunchDate > @date

So you only need an index on LaunchDate, whether clustered or non-clustered