Use Pandas groupby() + apply() with arguments

beta · Apr 19, 2017 · Viewed 37.6k times

I would like to use df.groupby() in combination with apply() to apply a function to each row per group.

I normally use the following code, which usually works (note that this is without groupby()):

df.apply(myFunction, args=(arg1,))

With the groupby() I tried the following:

df.groupby('columnName').apply(myFunction, args=(arg1,))

However, I get the following error:

TypeError: myFunction() got an unexpected keyword argument 'args'

Hence, my question is: How can I use groupby() and apply() with a function that needs arguments?

Answer

MaxU · Apr 19, 2017

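pandas.core.groupby.GroupBy.apply does NOT have a named parameter args, but pandas.DataFrame.apply does. GroupBy.apply forwards any extra positional and keyword arguments straight to the function, so args=(arg1,) reaches myFunction as an unexpected keyword argument named args.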
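A minimal sketch that reproduces the failure, assuming a small made-up frame; the names key, val and scaled_max are hypothetical and only there for illustration. The value columns are selected explicitly so the grouping column is not passed to the function:

```python
import pandas as pd

# Hypothetical data, not from the question.
df = pd.DataFrame({"key": ["x", "x", "y"], "val": [1, 2, 3]})


def scaled_max(group, n):
    """Return the per-column maximum of the group, multiplied by n."""
    return group.max() * n


def try_args_keyword():
    """Call GroupBy.apply with args=... and return the error text, if any."""
    try:
        # args=(10,) is forwarded to scaled_max as a keyword argument,
        # which scaled_max does not accept.
        df.groupby("key")[["val"]].apply(scaled_max, args=(10,))
    except TypeError as exc:
        return str(exc)
    return None


message = try_args_keyword()
print(message)
```

The printed message should be of the same kind as the TypeError quoted in the question.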

So try this:

df.groupby('columnName').apply(lambda x: myFunction(x, arg1))

or as suggested by @Zero:

df.groupby('columnName').apply(myFunction, arg1)

Demo:

In [82]: df = pd.DataFrame(np.random.randint(5,size=(5,3)), columns=list('abc'))

In [83]: df
Out[83]:
   a  b  c
0  0  3  1
1  0  3  4
2  3  0  4
3  4  2  3
4  3  4  1

In [84]: def f(ser, n):
    ...:     return ser.max() * n
    ...:

In [85]: df.apply(f, args=(10,))
Out[85]:
a    40
b    40
c    40
dtype: int64

When using GroupBy.apply you can pass either named arguments:

In [86]: df.groupby('a').apply(f, n=10)
Out[86]:
    a   b   c
a
0   0  30  40
3  30  40  40
4  40  20  30

or positional arguments (note that (10) is just the integer 10, not a tuple):

In [87]: df.groupby('a').apply(f, 10)
Out[87]:
    a   b   c
a
0   0  30  40
3  30  40  40
4  40  20  30
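
For anyone who wants to run this outside IPython, here is a self-contained sketch of the three call styles side by side. It hard-codes the values from the random frame shown in the demo, and it selects columns b and c explicitly (my own assumption, so that the result does not depend on how a given pandas version treats the grouping column):

```python
import pandas as pd

# Same values as the demo frame above, hard-coded instead of random.
df = pd.DataFrame({
    "a": [0, 0, 3, 4, 3],
    "b": [3, 3, 0, 2, 4],
    "c": [1, 4, 4, 3, 1],
})


def f(group, n):
    """Per-column maximum of the group, multiplied by n."""
    return group.max() * n


# Select the value columns explicitly; "a" is used only as the group key.
grouped = df.groupby("a")[["b", "c"]]

by_keyword = grouped.apply(f, n=10)             # extra keyword argument
by_position = grouped.apply(f, 10)              # extra positional argument
by_lambda = grouped.apply(lambda g: f(g, 10))   # argument bound in a lambda

print(by_keyword)
```

All three calls should give the same frame, with one row per distinct value of a and the b and c columns matching the demo output.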