Add a sequence number to each element in a group using python

DKA picture DKA · Mar 30, 2015 · Viewed 10.5k times · Source

I have a dataframe of individuals who each have multiple records. I want to enumerate the record in the sequence for each individual in python. Essentially I would like to create the 'sequence' column in the following table:

patient  date      sequence
145      20Jun2009        1
145      24Jun2009        2
145      15Jul2009        3
582      09Feb2008        1
582      21Feb2008        2
987      14Mar2010        1
987      02May2010        2
987      12May2010        3

This is essentially the same question as here, but I am working in python and unable to implement the sql solution. I suspect I can use a groupby statement with an iterable count, but have so far been unsuccessful. Thanks!

Answer

DKA picture DKA · Mar 30, 2015

I stumbled upon the answer which was embarrassingly simple. The groupby statement has a 'cumcount()' option which will enumerate group items.

df['sequence']=df.groupby('patient').cumcount()

The caveat is that the records have to be in the order you want them enumerated.