This particular case was distilled from an example where the programmer assumed that for two shipments into a tank car, line #1 would be loaded first. I corrected this to allow for the loading to be performed in any order - however, I discovered that MIN() OVER (PARTITION BY)
allows an ORDER BY
in Oracle (this is not allowed in SQL Server), and additionally, it alters the behavior of the function, causing the ORDER BY
to apparently be added to the PARTITION BY
.
WITH data AS (
SELECT 1 AS SHIPMENT_ID, 1 AS LINE_NUMBER, 2 AS TARE, 3 AS GROSS FROM DUAL
UNION ALL
SELECT 1 AS SHIPMENT_ID, 2 AS LINE_NUMBER, 1 AS TARE, 2 AS GROSS FROM DUAL
)
SELECT MIN(tare) OVER (PARTITION BY shipment_id) first_tare
,MAX(gross) OVER (PARTITION BY shipment_id) last_gross
,FIRST_VALUE(tare) OVER (PARTITION BY shipment_id ORDER BY LINE_NUMBER) first_tare_incorrect
,FIRST_VALUE(gross) OVER (PARTITION BY shipment_id ORDER BY LINE_NUMBER DESC) last_gross_incorrect
,MIN(tare) OVER (PARTITION BY shipment_id ORDER BY LINE_NUMBER) first_tare_incorrect_still
,MAX(gross) OVER (PARTITION BY shipment_id ORDER BY LINE_NUMBER DESC) last_gross_incorrect_still
,MIN(tare) OVER (PARTITION BY shipment_id, LINE_NUMBER) first_tare_incorrect_still2
,MAX(gross) OVER (PARTITION BY shipment_id, LINE_NUMBER) last_gross_incorrect_still2
FROM data
A SQL Server example (with non-applicable code commented out):
WITH data AS (
SELECT 1 AS SHIPMENT_ID, 1 AS LINE_NUMBER, 2 AS TARE, 3 AS GROSS -- FROM DUAL
UNION ALL
SELECT 1 AS SHIPMENT_ID, 2 AS LINE_NUMBER, 1 AS TARE, 2 AS GROSS -- FROM DUAL
)
SELECT MIN(tare) OVER (PARTITION BY shipment_id) first_tare
,MAX(gross) OVER (PARTITION BY shipment_id) last_gross
-- ,FIRST_VALUE(tare) OVER (PARTITION BY shipment_id ORDER BY LINE_NUMBER) first_tare_incorrect
-- ,FIRST_VALUE(gross) OVER (PARTITION BY shipment_id ORDER BY LINE_NUMBER DESC) last_gross_incorrect
-- ,MIN(tare) OVER (PARTITION BY shipment_id ORDER BY LINE_NUMBER) first_tare_incorrect_still
-- ,MAX(gross) OVER (PARTITION BY shipment_id ORDER BY LINE_NUMBER DESC) last_gross_incorrect_still
,MIN(tare) OVER (PARTITION BY shipment_id, LINE_NUMBER) first_tare_incorrect_still2
,MAX(gross) OVER (PARTITION BY shipment_id, LINE_NUMBER) last_gross_incorrect_still2
FROM data
So question: What is Oracle doing and why and is it right?
If you add an ORDER BY
to the MIN
analytic function, you turn it into a "min so far" function rather than an overall minimum. For the final row for whatever you're partitioning by, the results will be the same. But the prior rows may have a different "min so far" than the overall minimum.
Using the EMP
table as an example, you can see that the minimum salary so far for the department eventually converges on the overall minimum for the department. And you can see that the "min so far" value for any given department decreases as lower values are encountered.
SQL> ed
Wrote file afiedt.buf
1 select ename,
2 deptno,
3 sal,
4 min(sal) over (partition by deptno order by ename) min_so_far,
5 min(sal) over (partition by deptno) min_overall
6 from emp
7* order by deptno, ename
SQL> /
ENAME DEPTNO SAL MIN_SO_FAR MIN_OVERALL
---------- ---------- ---------- ---------- -----------
CLARK 10 2450 2450 1300
KING 10 5000 2450 1300
MILLER 10 1300 1300 1300
ADAMS 20 1110 1110 800
FORD 20 3000 1110 800
JONES 20 2975 1110 800
SCOTT 20 3000 1110 800
smith 20 800 800 800
ALLEN 30 1600 1600 950
BLAKE 30 2850 1600 950
MARTIN 30 1250 1250 950
SM0 30 950 950 950
TURNER 30 1500 950 950
WARD 30 1250 950 950
BAR
PAV
16 rows selected.
Of course, it would make more sense to use this form of the analytic function when you're trying to do something like compute a personal best that you can use as a comparison in future periods. If you're tracking an individual's decreasing golf scores, mile times, or weight, displaying personal bests can be a form of motivation.
SQL> ed
Wrote file afiedt.buf
1 with golf_scores as
2 ( select 1 golfer_id, 80 score, sysdate dt from dual union all
3 select 1, 82, sysdate+1 dt from dual union all
4 select 1, 72, sysdate+2 dt from dual union all
5 select 1, 75, sysdate+3 dt from dual union all
6 select 1, 71, sysdate+4 dt from dual union all
7 select 2, 74, sysdate from dual )
8 select golfer_id,
9 score,
10 dt,
11 (case when score=personal_best
12 then 'New personal best'
13 else null
14 end) msg
15 from (
16 select golfer_id,
17 score,
18 dt,
19 min(score) over (partition by golfer_id
20 order by dt) personal_best
21 from golf_scores
22* )
SQL> /
GOLFER_ID SCORE DT MSG
---------- ---------- --------- -----------------
1 80 12-SEP-11 New personal best
1 82 13-SEP-11
1 72 14-SEP-11 New personal best
1 75 15-SEP-11
1 71 16-SEP-11 New personal best
2 74 12-SEP-11 New personal best
6 rows selected.