How do you write a query where only a select number of rows are returned with either the highest or lowest column value.
i.e. A report with the 5 highest salaried employees?
The best way to do this is with analytic functions, RANK() or DENSE_RANK() ...
SQL> select * from (
2 select empno
3 , sal
4 , rank() over (order by sal desc) as rnk
5 from emp)
6 where rnk <= 5
7 /
EMPNO SAL RNK
---------- ---------- ----------
7839 5000 1
7788 3000 2
7902 3000 2
7566 2975 4
8083 2850 5
7698 2850 5
6 rows selected.
SQL>
DENSE_RANK() compresses the gaps when there is a tie:
SQL> select * from (
2 select empno
3 , sal
4 , dense_rank() over (order by sal desc) as rnk
5 from emp)
6 where rnk <= 5
7 /
EMPNO SAL RNK
---------- ---------- ----------
7839 5000 1
7788 3000 2
7902 3000 2
7566 2975 3
8083 2850 4
7698 2850 4
8070 2500 5
7 rows selected.
SQL>
Which behaviour you prefer depends upon your business requirements.
There is also the ROW_NUMBER() analytic function which we can use to return a precise number of rows. However, we should avoid using solutions based on row number unless the business logic is happy to arbitrarily truncate the result set in the event of a tie. There is a difference between asking for the five highest values and the first five records sorted by high values
There is also a non-analytic solution using the ROWNUM pseudo-column. This is clunky because ROWNUM is applied before the ORDER BY clause, which can lead to unexpected results. There is rarely any reason to use ROWNUM instead of ROW_NUMBER() or one of the ranking functions.