Having clause vs subquery

ferics2 · Oct 17, 2012

I could write a query using an aggregate function in two ways:

select team, count(min) as min_count
from table
group by team
having count(min) > 500

or

select * 
from (
    select team, count(min) as min_count
    from table
    group by team
) as A
where A.min_count > 500

Are there any performance benefits to either approach or are they functionally the same thing?

Answer

Gordon Linoff · Oct 18, 2012

The two versions are functionally the same. Well, the second is syntactically incorrect, but I assume you mean:

select * 
from (
    select team, count(min) as count
    from table
    group by team
) t
where count > 500

(You need the alias on the calculation and several leading databases require an alias on a subquery in a FROM clause.)

Being functionally equivalent does not mean that they are necessarily optimized the same way. There are often multiple ways to write a query that are functionally equivalent. However, the specific database engine/optimizer can choose (and often does choose) different optimization paths.
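
As a quick sanity check of the "functionally the same" part, here is a minimal sketch that runs both forms against SQLite through Python's standard sqlite3 module. The table name stats, the sample rows, and the threshold of 4 (instead of 500) are assumptions made up for illustration. It only shows that the two queries return the same rows on this toy data in this one engine, and says nothing about how either is optimized.

```python
import sqlite3

# Hypothetical sample data. The table is called "stats" because "table"
# is a reserved word; "min" is quoted so it is read as a column name.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE stats (team TEXT, "min" INTEGER)')
rows = [("A", m) for m in range(3)] + [("B", m) for m in range(5)]
conn.executemany("INSERT INTO stats VALUES (?, ?)", rows)

# Version 1: filter the aggregate with HAVING.
having_sql = """
    SELECT team, COUNT("min") AS min_count
    FROM stats
    GROUP BY team
    HAVING COUNT("min") > 4
"""

# Version 2: aggregate in a subquery, filter in the outer WHERE.
subquery_sql = """
    SELECT *
    FROM (
        SELECT team, COUNT("min") AS min_count
        FROM stats
        GROUP BY team
    ) AS t
    WHERE t.min_count > 4
"""

# Sort both result sets so row order does not affect the comparison.
having_rows = sorted(conn.execute(having_sql).fetchall())
subquery_rows = sorted(conn.execute(subquery_sql).fetchall())

print(having_rows)
print(having_rows == subquery_rows)
```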

In this case, the query is so simple that it is hard to think of multiple optimization paths. For both versions, the engine basically has to perform the aggregation and then test the second column against the filter. I personally cannot see many variations on this theme. Any decent SQL engine should use indexes, if appropriate, either in both cases or in neither.

So, the answer to this specific question is that in any reasonable database, these should result in the same execution plan (i.e., in the use of indexes, the use of parallelism, and the choice of aggregation algorithm). However, being functionally equivalent does not mean that a given database engine is going to produce the same execution plan. So, the general answer is "no".
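
The only reliable way to know what a particular engine does is to ask it for the plan of each version. Below is a rough sketch of doing that in SQLite, using its EXPLAIN QUERY PLAN statement through Python's sqlite3 module. The table name stats and the plan helper are hypothetical, and the output format is specific to SQLite and varies between versions. Other engines expose plans through their own commands, so treat this as one way to compare plans, not as evidence about any other database.

```python
import sqlite3

# Hypothetical empty table; "stats" stands in for the question's table.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE stats (team TEXT, "min" INTEGER)')


def plan(sql):
    """Return the human-readable plan steps SQLite reports for a query."""
    # EXPLAIN QUERY PLAN is SQLite-specific; the last column of each
    # returned row holds the textual description of that step.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


having_plan = plan("""
    SELECT team, COUNT("min") AS min_count
    FROM stats
    GROUP BY team
    HAVING COUNT("min") > 500
""")

subquery_plan = plan("""
    SELECT *
    FROM (
        SELECT team, COUNT("min") AS min_count
        FROM stats
        GROUP BY team
    ) AS t
    WHERE t.min_count > 500
""")

# Print both plans so they can be compared by eye.
print("HAVING version:")
for step in having_plan:
    print("  ", step)
print("Subquery version:")
for step in subquery_plan:
    print("  ", step)
```

If the two listings differ, it is still worth timing both on realistic data before concluding that one form is faster.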