I have a table in SQL Server 2012 as the snapshot shows:
Then I'm using Last_Value() and First Value to get AverageAmount of each EmpID for different YearMonth. The script is as follows:
SELECT A.EmpID,
First_Value(A.AverageAmount) OVER (PARTITION BY A.EmpID Order by A.DimYearMonthKey asc) AS '200901AvgAmount',
Last_Value(A.AverageAmount) OVER (PARTITION BY A.EmpID Order by A.DimYearMonthKey asc) AS '201112AvgAmount'
FROM Emp_Amt AS A
However, the result for this query is:
In column of "201112AvgAmount", it shows different values for each EmpID while "200901AvgAmount" has correct values.
Is there anything wrong with my SQL script? I did a lot research online but still cannot find the answer....
Here's a quick query to illustrate the behaviour:
select
v,
-- FIRST_VALUE() and LAST_VALUE()
first_value(v) over(order by v) f1,
first_value(v) over(order by v rows between unbounded preceding and current row) f2,
first_value(v) over(order by v rows between unbounded preceding and unbounded following) f3,
last_value (v) over(order by v) l1,
last_value (v) over(order by v rows between unbounded preceding and current row) l2,
last_value (v) over(order by v rows between unbounded preceding and unbounded following) l3,
-- For completeness' sake, let's also compare the above with MAX()
max (v) over() m1,
max (v) over(order by v) m2,
max (v) over(order by v rows between unbounded preceding and current row) m3,
max (v) over(order by v rows between unbounded preceding and unbounded following) m4
from (values(1),(2),(3),(4)) t(v)
The output of the above query can be seen here (SQLFiddle here):
| V | F1 | F2 | F3 | L1 | L2 | L3 | M1 | M2 | M3 | M4 |
|---|----|----|----|----|----|----|----|----|----|----|
| 1 | 1 | 1 | 1 | 1 | 1 | 4 | 4 | 1 | 1 | 4 |
| 2 | 1 | 1 | 1 | 2 | 2 | 4 | 4 | 2 | 2 | 4 |
| 3 | 1 | 1 | 1 | 3 | 3 | 4 | 4 | 3 | 3 | 4 |
| 4 | 1 | 1 | 1 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
Few people think of the implicit frames that are applied to window functions that take an ORDER BY
clause. In this case, windows are defaulting to the frame RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
. (RANGE is not exactly the same as ROWS, but that's another story). Think about it this way:
v = 1
the ordered window's frame spans v IN (1)
v = 2
the ordered window's frame spans v IN (1, 2)
v = 3
the ordered window's frame spans v IN (1, 2, 3)
v = 4
the ordered window's frame spans v IN (1, 2, 3, 4)
If you want to prevent that behaviour, you have two options:
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
clause for ordered window functionsORDER BY
clause in those window functions that allow for omitting them (as MAX(v) OVER()
)More details are explained in this article about LEAD()
, LAG()
, FIRST_VALUE()
and LAST_VALUE()