Pandas' EMA not matching the stock's EMA?

Rob M picture Rob M · Feb 13, 2018 · Viewed 11.5k times · Source

I am trying to use Python (with Pandas) to calculate the 20-day Exponential Moving Averages (EMA) of daily stock data for Intel (INTC). Pandas has a number of ways of doing this, and I've also tried stockstats, which runs on Pandas, but they never return the same EMA as I get from stock/finance websites.

I've double checked the close prices, and they match, but the EMA always comes out "wrong".

This is the CSV I'm using: INTC Stock Data

It contains the daily Date, Month Name, Open, High, Low, Close, Day Avg, and Volume for Intel's stock (Ticker: INTC) from 4/20/2016 to 2/1/2018.

When I look to the bigger stock websites like MarketWatch or Fidelity, their numbers don't match mine. They match each other, but not me.

For example...

df2['Close'].ewm(span=20,min_periods=0,adjust=False,ignore_na=False).mean()

or...

df2['Close'].ewm(span=20, min_periods=20, adjust=True).mean()

or...

df2["Close"].shift().fillna(df["Close"]).ewm(com=1, adjust=False).mean()

Give me EMA's for 2/1/2018 like $44.71, $47.65, $46.15, etc. when the real 20-Day EMA on any finance site is $45.65. And I get the wrong numbers no matter what date I try to compute the EMA for. It's even wrong when I just try for 5-Day EMAs.

I've read, watched and followed tutorials on the subject, but their results also don't match the accepted/published EMA's you'd find on any finance site. The people creating the tutorials and videos simply never check them against each other after Panda's crunches the numbers. And I need my numbers to match.

How do I arrive at the same figures every other finance site on the internet is getting for EMAs? I don't think this has anything to do with adjusted close prices because I'm using old/settled data and my close prices and dates are the same as theirs.

Answer

unutbu picture unutbu · Feb 13, 2018

Sort the DataFrame so that the dates are in increasing order. Since your data is in decreasing order by date, if you don't sort the dates first, your ewm calculation exponentially weights the earliest dates the most, rather than the latest date (as it should be).

import pandas as pd

df = pd.read_csv('intc_data.txt', parse_dates=['Date'], index_col=['Date'])
df['backward_ewm'] = df['Close'].ewm(span=20,min_periods=0,adjust=False,ignore_na=False).mean()
df = df.sort_index()
df['ewm'] = df['Close'].ewm(span=20,min_periods=0,adjust=False,ignore_na=False).mean()
print(df[['ewm', 'backward_ewm']].tail())

yields

                  ewm  backward_ewm
Date                               
2018-01-26  45.370936     48.205638
2018-01-29  45.809895     48.008337
2018-01-30  46.093714     47.800794
2018-01-31  46.288599     47.696667
2018-02-01  46.418256     47.650000

This agrees with Marketwatch which says the EWMA(20) on 2018-02-01 was 46.42.

enter image description here