Group by without an aggregate function

zoran119 · Mar 20, 2017

I've seen a pandasql query like this:

import pandas as pd
from pandasql import sqldf

df = pd.DataFrame({'A': [1, 2, 2], 'B': [3, 4, 5]})
sqldf('select * from df group by A', locals())

This gives:

   A  B
0  1  3
1  2  6

I find it really weird to have a group by without an aggregate function. Can anyone tell me which function is used on the non-grouped columns to reduce multiple values into one?

Answer

Andrew L · Mar 20, 2017

It looks like the groupby method you're looking for is last():

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2], 'B': [3, 4, 5]})
# Keep the last row's value of each non-grouped column per group
df.groupby('A', as_index=False).last()

Output:

   A  B
0  1  3
1  2  5

I'm assuming the 5 in your input was a typo (see my comment above) and was meant to be 6, which would make the 6 in your output the last value of B in the group where A is 2.
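
As a quick check of that reading, here is a minimal sketch that reruns the groupby with the third value of B changed to 6. That change is only an assumption about what the question intended, and the name `result` is just for illustration. It shows that last() would then produce the table from the question; it does not prove that pandasql itself uses last().

```python
import pandas as pd

# Assumption: the third value of B was meant to be 6 rather than 5.
df = pd.DataFrame({'A': [1, 2, 2], 'B': [3, 4, 6]})

# last() keeps, for each group of A, the value from the group's final row.
result = df.groupby('A', as_index=False).last()
print(result)
```

With this input, the row for A = 2 ends up with B = 6, which matches the output shown in the question.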