Finding the first row in a group using Hive

Praveen Sripati picture Praveen Sripati · Sep 24, 2014 · Viewed 22.3k times · Source

For a student database in the following format

Roll Number | School Name | Name | Age | Gender | Class | Subject | Marks

How to find out who got the highest for each class? The below query returns the entire group, but I am interested in finding the first row in the group.

select school,class,roll,sum(marks) as total from students group by school,class,roll order by school, class, total desc;

Answer

FuzzyTree picture FuzzyTree · Sep 24, 2014

Another way using row_number()

select * from (
    select *, 
    row_number() over (partition by school,class,roll order by marks desc) rn
    from students
) t1 where rn = 1

If you want to return all ties for top marks, then use rank() instead of row_number()