HAVING clause without grouping by all non-aggregate columns in SELECT

user1392795 picture user1392795 · Mar 19, 2013 · Viewed 8.8k times · Source

HAVING clause without grouping by all non-aggregate columns in SELECT I am often faced with tables that have an id column and a month column along with many other metrics. I weed out certain rows based on other columns. I only want to id’s that have all 12 months’ worth of data, so I do the following:

proc sql;
   create table t as
   SELECT *
   FROM  T1
   GROUP BY id
   HAVING COUNT(id) = 12
quit;

It seems to work for me, but I was wondering if there are any dangers with this method. I know I can join an aggregated subquery with only id and the count of ids, but this method is much easier to explain to non sql users.

In general, can you used the HAVING clause with only a couple of the columns in the GROUP BY and no aggregate functions in the SELECT statement?

Answer

Dan Bracuk picture Dan Bracuk · Mar 19, 2013

It depends on the database. On one of mine (redbrick) the query your presented will throw an error. To make it work, I'd have to change

select * 

to

select id.  

However, you might have misguided motivation. I am usually interested in the most efficient way to get the correct answer. Clarity to myself and anyone else who might have to maintain it is also important, but clarity to laypeople is not one of my priorities.