pandas: get all groupby values in an array

ru111 · Mar 12, 2019

I'm sure this has been asked before, sorry if duplicate. Suppose I have the following dataframe:

import pandas as pd

df = pd.DataFrame({'key': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'data': range(6)}, columns=['key', 'data'])

>>
    key data
0   A   0
1   B   1
2   C   2
3   A   3
4   B   4
5   C   5

I know that after grouping on 'key' we can aggregate, for example with df.groupby('key').sum():

>> 
    data
key 
A   3
B   5
C   7

What is the easiest way to get all the 'split' values of each group in an array instead?

>> 
    data
key 
A   [0, 3]
B   [1, 4]
C   [2, 5]

I'm not necessarily grouping by just one key but by several other keys as well ('year' and 'month', for example), which is why I'd like to use the groupby function while preserving all the grouped values in an array.

Answer

anky · Mar 12, 2019

You can use apply(list):

print(df.groupby('key').data.apply(list).reset_index())

  key    data
0   A  [0, 3]
1   B  [1, 4]
2   C  [2, 5]
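
Since the question mentions grouping by more than one key, here is a minimal sketch of the same apply(list) approach with a list of grouping columns. The 'year' and 'month' columns and their values are made up for illustration; they are not from the original post.

```python
import pandas as pd

# Hypothetical data: 'year' and 'month' are illustrative columns,
# assumed here only to show grouping by several keys at once.
df = pd.DataFrame({
    'key':   ['A', 'B', 'A', 'B', 'A', 'B'],
    'year':  [2018, 2018, 2018, 2018, 2019, 2019],
    'month': [1, 1, 1, 1, 2, 2],
    'data':  range(6),
})

# Pass a list of column names to groupby, then collect each
# group's 'data' values into a Python list.
grouped = (
    df.groupby(['key', 'year', 'month'])['data']
      .apply(list)
      .reset_index()
)

print(grouped)
```

With this sample data there are four groups, and the 'data' column holds [0, 2], [4], [1, 3] and [5] for (A, 2018, 1), (A, 2019, 2), (B, 2018, 1) and (B, 2019, 2) respectively. Leaving out reset_index() should keep the grouping keys as a MultiIndex instead of turning them back into columns.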