How can I identify groups of consecutive dates in SQL?

Pythonn00b picture Pythonn00b · Jul 10, 2012 · Viewed 9.5k times · Source

Im trying to write a function which identifies groups of dates, and measures the size of the group.

I've been doing this procedurally in Python until now but I'd like to move it into SQL.

for example, the list

Bill 01/01/2011 
Bill 02/01/2011 
Bill 03/01/2011 
Bill 05/01/2011 
Bill 07/01/2011 

should be output into a new table as:

Bill 01/01/2011  3 
Bill 02/01/2011  3 
Bill 03/01/2011  3 
Bill 05/01/2011  1 
Bill 07/01/2011  1

Ideally this should also be able to account for weekends and public holidays - the dates in my table will aways be Mon-Fri (I think I can solve this by making a new table of working days and numbering them in sequence). Someone at work suggested I try a CTE. Im pretty new to this, so I'd appreciate any guidance anyone could provide! Thanks.

Answer

Gordon Linoff picture Gordon Linoff · Jul 10, 2012

You can do this with a clever application of window functions. Consider the following:

select name, date, row_number() over (partition by name order by date)
from t

This adds a row number, which in your example would simply be 1, 2, 3, 4, 5. Now, take the difference from the date, and you have a constant value for the group.

select name, date,
       dateadd(d, - row_number() over (partition by name order by date), date) as val
from t

Finally, you want the number of groups in sequence. I would also add a group identifier (for instance, to distinguish between the last two).

select name, date,
       count(*) over (partition by name, val) as NumInSeq,
       dense_rank() over (partition by name order by val) as SeqID
from (select name, date,
             dateadd(d, - row_number() over (partition by name order by date), date) as val
      from t
     ) t

Somehow, I missed the part about weekdays and holidays. This solution does not solve that problem.