Lag() with condition in SQL Server

user3292586 · Feb 14, 2014

I have a table like this:

Number   Price    Type       Date         Time
------   -----    ----    ----------    ---------
23456    0,665     SV     2014/02/02     08:00:02
23457    1,3       EC     2014/02/02     07:50:45
23460    0,668     SV     2014/02/02     07:36:34

For each EC row I need the previous and next SV price. In this case, the query is simple:

Select Lag(price, 1, price) over (order by date desc, time desc),
Lead(price, 1, price) over (order by date desc, time desc)
from ITEMS

But there are some special cases where two or more consecutive rows are of type EC:

Number   Price    Type       Date         Time
------   -----    ----    ----------    ---------
23456    0,665     SV     2014/02/02     08:00:02
23457    1,3       EC     2014/02/02     07:50:45
23658    2,4       EC     2014/02/02     07:50:45
23660    2,4       EC     2014/02/02     07:50:48
23465    0,668     SV     2014/02/02     07:36:34 

Can I use Lead/Lag in these cases? If not, do I have to use a subquery?

Answer

Andomar · Feb 14, 2014

Your question (and Anon's excellent answer) is an instance of the SQL "gaps and islands" problem. In this answer, I will try to examine the "row_number() magic" in detail.

I've made a simple example based on events in a ballgame. For each event, we'd like to print the previous and next quarter-related message:

create table TestTable (id int identity, event varchar(64));
insert TestTable values
    ('Start of Q1'),
    ('Free kick'),
    ('Goal'),
    ('End of Q1'),
    ('Start of Q2'),
    ('Penalty'),
    ('Miss'),
    ('Yellow card'),
    ('End of Q2');

Here's a query showing off the "row_number() magic" approach:

; with  grouped as
        (
        select  *
        ,       row_number() over (order by id) as rn1
        ,       row_number() over (
                    partition by case when event like '%of Q[1-4]' then 1 end 
                    order by id) as rn2
        from    TestTable
        )
,       order_in_group as
        (
        select  *
        ,       rn1-rn2 as group_nr
        ,       row_number() over (partition by rn1-rn2 order by id) as rank_asc
        ,       row_number() over (partition by rn1-rn2 order by id desc)
                    as rank_desc
        from    grouped
        )
select  *
,       lag(event, rank_asc) over (order by id) as last_event_of_prev_group
,       lead(event, rank_desc) over (order by id) as first_event_of_next_group
from    order_in_group
order by
        id
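  • The first CTE, called "grouped", calculates two row_number()s. The first is 1, 2, 3 and so on across all rows in the table. The second row_number() numbers the pause announcements (the "Start of Q" and "End of Q" rows) in one list, and all other events in a second list. The difference between the two, rn1 - rn2, stays constant within one section of the game. It's helpful to check the difference in the example output: it's in the group_nr column. You'll see that each value corresponds to one section of the game. (In this data every section also gets a distinct value. In general the difference only separates sections that fall in the same rn2 list, so if the two kinds of rows can alternate one row at a time, partition by the case expression as well as by rn1 - rn2.)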
  • The second CTE called "order_in_group" determines the position of the current row within its island or gap. For an island with 3 rows, the positions are 1 2 3 for the ascending order, and 3 2 1 for the descending order.
  • Finally, we know enough to tell lag() and lead() how far to jump. We have to lag rank_asc rows to find the final row of the previous section. To find the first row of the next section, we have to lead rank_desc rows.
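
Applied to the table from the question, the same three steps might look like the sketch below. The column types and the tie-breaker on Number (two EC rows share the same time) are my assumptions, as is the idea that only the SV and EC types occur, so that the row just outside an EC island is always an SV row. Prices are written with a decimal point rather than the comma shown in the question.

```sql
-- Assumed schema for the question's ITEMS table; the types are guesses.
create table ITEMS (
    Number int,
    Price  decimal(9,3),
    [Type] char(2),
    [Date] date,
    [Time] time(0)
);

insert ITEMS values
    (23456, 0.665, 'SV', '2014-02-02', '08:00:02'),
    (23457, 1.3,   'EC', '2014-02-02', '07:50:45'),
    (23658, 2.4,   'EC', '2014-02-02', '07:50:45'),
    (23660, 2.4,   'EC', '2014-02-02', '07:50:48'),
    (23465, 0.668, 'SV', '2014-02-02', '07:36:34');

; with  grouped as
        (
        select  *
                -- position in the whole table, oldest row first;
                -- Number breaks ties between rows with the same time
        ,       row_number() over (order by [Date], [Time], Number) as rn1
                -- position within the rows of the same type
        ,       row_number() over (
                    partition by [Type]
                    order by [Date], [Time], Number) as rn2
        from    ITEMS
        )
,       order_in_group as
        (
        select  *
                -- partitioning by [Type] as well keeps an SV island and an
                -- EC island apart even if their rn1 - rn2 happens to match
        ,       row_number() over (
                    partition by [Type], rn1 - rn2 order by rn1) as rank_asc
        ,       row_number() over (
                    partition by [Type], rn1 - rn2 order by rn1 desc)
                    as rank_desc
        from    grouped
        )
,       neighbours as
        (
        -- lag/lead must see all rows, so filter on EC only afterwards
        select  *
        ,       lag(Price, rank_asc) over (order by rn1) as earlier_sv_price
        ,       lead(Price, rank_desc) over (order by rn1) as later_sv_price
        from    order_in_group
        )
select  Number
,       Price
,       earlier_sv_price
,       later_sv_price
from    neighbours
where   [Type] = 'EC'
order by
        rn1;
```

With the five sample rows, each of the three EC rows should come back with 0.668 as the earlier SV price and 0.665 as the later one. The sketch orders by ascending time; if you order descending as in the question, "previous" and "next" swap. An EC island at the very start or end of the table gets NULL on that side, because lag() and lead() run off the edge.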

Hope this helps clarify the "magic" of Gaps and Islands. Here is a working example at SQL Fiddle.