SQL Group By function(column) - Now can't Select that column

Mark Estrada · May 26, 2011

I used the HR employee schema in Oracle Express, and I wanted to count the employees hired in each year.

  SELECT hire_date, 
         COUNT(*)
    FROM employees empl
GROUP BY SUBSTR(hire_date, -4)
ORDER BY empl.hire_date;

The hire_date column has the format "1/1/2011", so I'd like to group the rows by extracting the last four characters.

The problem is that I get the following error:

ORA-00979: not a GROUP BY expression
00979. 00000 -  "not a GROUP BY expression"
*Cause:    
*Action:
Error at Line: 1 Column: 7

Is this not possible?

Answer

paxdiablo · May 26, 2011

You can't select the full hire_date if you're grouping by only the last four digits of it. Think of what will happen if you have the two rows:

hire_date
=========
01/01/2001
02/02/2001

In the single row generated when you group those, what should the hire_date be?

Every column selected must either be a group-by column or an aggregate column. In other words, try:

select substr(hire_date,-4), count(*)
from employees
group by substr(hire_date,-4)
order by substr(hire_date,-4);

I should mention that per-row functions are notoriously bad at scaling. If you want to process the year a lot, you should consider splitting it into its own column. That may greatly improve performance, but measure, don't guess!
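As a rough sketch of what splitting the year into its own column could look like: the column name hire_year is my own invention, and I'm assuming hire_date is a real DATE column. You would also need a trigger or application code to keep the new column in sync on inserts and updates, which isn't shown here.

```sql
-- hire_year is a hypothetical column name; assumes hire_date is a DATE column.
ALTER TABLE employees ADD (hire_year NUMBER(4));

-- Backfill the existing rows once.
UPDATE employees
   SET hire_year = EXTRACT(YEAR FROM hire_date);

-- Grouping now reads the stored value instead of computing it for every row.
SELECT hire_year, COUNT(*)
  FROM employees
 GROUP BY hire_year
 ORDER BY hire_year;
```

Whether the extra column pays for itself depends on table size and on how often you query by year, so compare timings before and after.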

And, as others have mentioned in comments, substr is probably not the best solution, since its result may depend on locale: if the date is formatted as YYYY-MM-DD, the last four characters are no longer the year.

It may be better to use something like to_char(hire_date,'YYYY') or extract(year from hire_date), which should be more robust.
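For example, a version using extract might look like the following. It assumes hire_date is a DATE column; the aliases hire_year and num_hired are just illustrative names.

```sql
-- Assumes hire_date is a DATE column; hire_year and num_hired are illustrative aliases.
SELECT EXTRACT(YEAR FROM hire_date) AS hire_year,
       COUNT(*)                     AS num_hired
  FROM employees
 GROUP BY EXTRACT(YEAR FROM hire_date)   -- repeat the expression, not the alias
 ORDER BY hire_year;                     -- an alias is allowed in ORDER BY
```

The same expression appears in both the select list and the group by clause, so every selected column is either a grouping expression or an aggregate.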