Why no windowed functions in where clauses?

Chris Pfohl picture Chris Pfohl · Dec 21, 2012 · Viewed 43.7k times · Source

Title says it all, why can't I use a windowed function in a where clause in SQL Server?

This query makes perfect sense:

select id, sales_person_id, product_type, product_id, sale_amount
from Sales_Log
where 1 = row_number() over(partition by sales_person_id, product_type, product_id order by sale_amount desc)

But it doesn't work. Is there a better way than a CTE/Subquery?

EDIT

For what its worth this is the query with a CTE:

with Best_Sales as (
    select id, sales_person_id, product_type, product_id, sale_amount, row_number() over (partition by sales_person_id, product_type, product_id order by sales_amount desc) rank
    from Sales_log
)
select id, sales_person_id, product_type, product_id, sale_amount
from Best_Sales
where rank = 1

EDIT

+1 for the answers showing with a subquery, but really I'm looking for the reasoning behind not being able to use windowing functions in where clauses.

Answer

Martin Smith picture Martin Smith · Dec 21, 2012

why can't I use a windowed function in a where clause in SQL Server?

One answer, though not particularly informative, is because the spec says that you can't.

See the article by Itzik Ben Gan - Logical Query Processing: What It Is And What It Means to You and in particular the image here. Window functions are evaluated at the time of the SELECT on the result set remaining after all the WHERE/JOIN/GROUP BY/HAVING clauses have been dealt with (step 5.1).

really I'm looking for the reasoning behind not being able to use windowing functions in where clauses.

The reason that they are not allowed in the WHERE clause is that it would create ambiguity. Stealing Itzik Ben Gan's example from High-Performance T-SQL Using Window Functions (p.25)

Suppose your table was

CREATE TABLE T1
(
col1 CHAR(1) PRIMARY KEY
)

INSERT INTO T1 VALUES('A'),('B'),('C'),('D'),('E'),('F')

And your query

SELECT col1
FROM T1
WHERE ROW_NUMBER() OVER (ORDER BY col1) <= 3
AND col1 > 'B'

What would be the right result? Would you expect that the col1 > 'B' predicate ran before or after the row numbering?