Pandas: Cumulative return function

Kelaref picture Kelaref · Nov 25, 2016 · Viewed 12k times · Source

I have a dataframe such as the following:

  Index      Return
2008-11-21   0.153419
2008-11-24   0.037421
2008-11-25   0.077500

What's the best way to calculate a cumulative return across all columns on the last row?

Following is the intended result:

  Index      Return
2008-11-21   0.153419
2008-11-24   0.037421
2008-11-25   0.077500
Cumulative   0.289316

Where cumulative return calculated as follows:

cumulative = (1 + return1) * (1 + return2) * (1 + return3) - 1 

What is the best way to perform this in pandas?

Answer

Steven G picture Steven G · Nov 25, 2016

there is a pandas cumprod() method for that. this will work for every columns.

df.ix["Cumulative"] = ((df+1).cumprod()-1).iloc[-1]

this would be about 2 time faster than other solutions on large dataset:

In[106]: %timeit df.ix["Cumulative"] = ((df+1).cumprod()-1).iloc[-1]
10 loops, best of 3: 18.4 ms per loop
In[107]: %timeit df.ix['Cummulative'] = df.apply(lambda x: (x+1).prod()-1)
10 loops, best of 3: 32.9 ms per loop
In[110]: %timeit df.append(df.iloc[:,1:].apply(lambda col: (col + 1).prod() - 1), ignore_index=True)
10 loops, best of 3: 37.1 ms per loop
In[113]: %timeit df.append(df.apply(lambda col: prod([(1+c) for c in col]) - 1), ignore_index=True)
1 loop, best of 3: 262 ms per loop

I would suggest to never use apply if you can find a built-in method since apply is looping over the dataframe which makes it slow. Bult-in method are highly efficient and normally there is no way you are going to get faster than them using apply.