With the following data:
create table #ph (product int, [date] date, price int)
insert into #ph select 1, '20120101', 1
insert into #ph select 1, '20120102', 1
insert into #ph select 1, '20120103', 1
insert into #ph select 1, '20120104', 1
insert into #ph select 1, '20120105', 2
insert into #ph select 1, '20120106', 2
insert into #ph select 1, '20120107', 2
insert into #ph select 1, '20120108', 2
insert into #ph select 1, '20120109', 1
insert into #ph select 1, '20120110', 1
insert into #ph select 1, '20120111', 1
insert into #ph select 1, '20120112', 1
I would like to produce the following output:
product | date_from | date_to | price
1 | 20120101 | 20120105 | 1
1 | 20120105 | 20120109 | 2
1 | 20120109 | 20120112 | 1
If I group by price and take the min and max date, I get the following, which is not what I want (note the overlapping dates):
product | date_from | date_to | price
1 | 20120101 | 20120112 | 1
1 | 20120105 | 20120108 | 2
So essentially, I want to group on product and price, but start a new group at every step change in price.
What is the cleanest way to achieve this?
There's a fairly well-known technique for this kind of problem (often called "gaps and islands") that uses two ROW_NUMBER() calls:
WITH marked AS (
    SELECT
        *,
        grp = ROW_NUMBER() OVER (PARTITION BY product        ORDER BY date)
            - ROW_NUMBER() OVER (PARTITION BY product, price ORDER BY date)
    FROM #ph
)
SELECT
    product,
    date_from = MIN(date),
    date_to   = MAX(date),
    price
FROM marked
GROUP BY
    product,
    price,
    grp
ORDER BY
    product,
    MIN(date)
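To see why the difference of the two row numbers works as a group key, it can help to look at the intermediate values. The following is only a diagnostic sketch against the same #ph table; the column names rn_all and rn_price are illustrative and do not appear in the query above.

```sql
-- Diagnostic sketch: expose the two row numbers that make up grp.
-- rn_all and rn_price are illustrative names, not part of the original query.
SELECT
    product,
    [date],
    price,
    -- position of the row among all rows of the product
    rn_all   = ROW_NUMBER() OVER (PARTITION BY product        ORDER BY [date]),
    -- position of the row among rows of the product with the same price
    rn_price = ROW_NUMBER() OVER (PARTITION BY product, price ORDER BY [date]),
    -- the difference stays constant while the price does not change
    grp      = ROW_NUMBER() OVER (PARTITION BY product        ORDER BY [date])
             - ROW_NUMBER() OVER (PARTITION BY product, price ORDER BY [date])
FROM #ph
ORDER BY product, [date];
```

With the sample data, grp is 0 for 2012-01-01 through 2012-01-04 (price 1), 4 for 2012-01-05 through 2012-01-08 (price 2), and 4 again for 2012-01-09 through 2012-01-12 (price 1). The value is only unique within a given price, which is why price has to stay in the GROUP BY alongside grp. Grouping on product, price, and grp therefore yields one row per run.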
Output of the grouped query:
product  date_from   date_to     price
-------  ----------  ----------  -----
1        2012-01-01  2012-01-04  1
1        2012-01-05  2012-01-08  2
1        2012-01-09  2012-01-12  1
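Note that date_to here is the last date of each run, whereas the output requested in the question ends each range on the date the next one starts. If that is really what is wanted, one possible variation is sketched below. It assumes SQL Server 2012 or later, since it relies on LEAD, and the CTE name ranges is just an illustrative choice.

```sql
-- Sketch, assuming SQL Server 2012+ (LEAD is not available in earlier versions).
WITH marked AS (
    SELECT
        product,
        [date],
        price,
        grp = ROW_NUMBER() OVER (PARTITION BY product        ORDER BY [date])
            - ROW_NUMBER() OVER (PARTITION BY product, price ORDER BY [date])
    FROM #ph
),
ranges AS (  -- illustrative name: one row per consecutive run of the same price
    SELECT
        product,
        price,
        date_from = MIN([date]),
        date_to   = MAX([date])
    FROM marked
    GROUP BY product, price, grp
)
SELECT
    product,
    date_from,
    -- end each range at the start of the next one;
    -- the last range of a product keeps its own end date
    date_to = COALESCE(
                  LEAD(date_from) OVER (PARTITION BY product ORDER BY date_from),
                  date_to
              ),
    price
FROM ranges
ORDER BY product, date_from;
```

For the sample data this should give the ranges 2012-01-01 to 2012-01-05, 2012-01-05 to 2012-01-09, and 2012-01-09 to 2012-01-12, matching the output asked for in the question. If the query is not the first statement in its batch, the preceding statement needs a terminating semicolon before WITH.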