I've seen a pandasql query like this:
import pandas as pd
from pandasql import sqldf

df = pd.DataFrame({'A': [1, 2, 2], 'B': [3, 4, 5]})
sqldf('select * from df group by A', locals())
This gives:
   A  B
0  1  3
1  2  6
I find it really weird to have a group by without an aggregate function, but can anyone tell me which function is used on the aggregated columns to reduce multiple values into one?
It looks like the groupby method you're looking for is last():
df = pd.DataFrame({'A': [1, 2, 2], 'B': [3, 4, 5]})
df.groupby('A', as_index=False).last()
Output:
   A  B
0  1  3
1  2  5
I'm assuming the 5 in your input data was a typo for 6 (see my comment above), which would make the 6 in your output the last B value in the A = 2 group.
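One caveat, as far as I understand it: pandasql runs the query through SQLite by default, and SQLite documents a non-aggregated ("bare") column in a GROUP BY query as coming from an arbitrarily chosen row of each group. So last() matches what you observe, but it isn't a guarantee. Here is a minimal sketch using the standard-library sqlite3 module directly instead of pandasql (the table name df and the max(B) variant are just my choices for illustration):

```python
import sqlite3

# In-memory SQLite database holding the same data as the DataFrame above.
conn = sqlite3.connect(":memory:")
conn.execute("create table df (A integer, B integer)")
conn.executemany("insert into df values (?, ?)", [(1, 3), (2, 4), (2, 5)])

# Bare column B: SQLite picks B from some row of each group.
# For A = 2 that is either 4 or 5; the docs do not promise which.
rows = conn.execute("select * from df group by A order by A").fetchall()
print(rows)

# Naming an aggregate makes the result well defined.
explicit = conn.execute(
    "select A, max(B) as B from df group by A order by A"
).fetchall()
print(explicit)
```

If you need a specific row per group, it's probably safer to say so explicitly (an aggregate in SQL, or first()/last() in pandas) than to rely on the bare-column behaviour.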