pandas groupby sort within groups

JoeDanger picture JoeDanger · Jan 8, 2015 · Viewed 323.3k times · Source

I want to group my dataframe by two columns and then sort the aggregated results within the groups.

In [167]:
df

Out[167]:
count   job source
0   2   sales   A
1   4   sales   B
2   6   sales   C
3   3   sales   D
4   7   sales   E
5   5   market  A
6   3   market  B
7   2   market  C
8   4   market  D
9   1   market  E

In [168]:
df.groupby(['job','source']).agg({'count':sum})

Out[168]:
            count
job     source  
market  A   5
        B   3
        C   2
        D   4
        E   1
sales   A   2
        B   4
        C   6
        D   3
        E   7

I would now like to sort the count column in descending order within each of the groups. And then take only the top three rows. To get something like:

            count
job     source  
market  A   5
        D   4
        B   3
sales   E   7
        C   6
        B   4

Answer

tvashtar picture tvashtar · Mar 18, 2016

You could also just do it in one go, by doing the sort first and using head to take the first 3 of each group.

In[34]: df.sort_values(['job','count'],ascending=False).groupby('job').head(3)

Out[35]: 
   count     job source
4      7   sales      E
2      6   sales      C
1      4   sales      B
5      5  market      A
8      4  market      D
6      3  market      B