I am wondering if there is a better way to test if two variables are cointegrated than the following method:
import numpy as np
import statsmodels.api as sm
import statsmodels.tsa.stattools as ts

y = np.random.normal(0, 1, 250)
x = np.random.normal(0, 1, 250)

def cointegration_test(y, x):
    # Step 1: regress one variable on the other
    ols_result = sm.OLS(y, x).fit()
    # Step 2: obtain the residuals (ols_result.resid)
    # Step 3: apply the Augmented Dickey-Fuller test to see whether
    # the residuals have a unit root
    return ts.adfuller(ols_result.resid)
The above method works, but it is not very efficient. When I run sm.OLS, a lot of things are calculated besides the residuals, which increases the run time. I could write my own code that calculates just the residuals, but I don't think that would be very efficient either.
I am looking for either a built-in test that tests for cointegration directly (I was thinking of pandas, but I can't seem to find anything), a clever way to test for cointegration without running a regression, or some other efficient method.
I have to run a lot of cointegration tests, and it would be nice to improve on my current method.
You could try the built-in cointegration test in statsmodels, statsmodels.tsa.stattools.coint:
import statsmodels.tsa.stattools as ts
result = ts.coint(x, y)
Edit:
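import statsmodels.tsa.stattools as ts
import numpy as np
import pandas as pd
# pandas.io.data was moved out of pandas into the separate pandas-datareader package
import pandas_datareader.data as web

# Fetch one year of daily data for Facebook and Apple
data1 = web.DataReader('FB', data_source='yahoo', start='4/4/2015', end='4/4/2016')
data2 = web.DataReader('AAPL', data_source='yahoo', start='4/4/2015', end='4/4/2016')

# Merge on the date index so both series cover the same days
data1['key'] = data1.index
data2['key'] = data2.index
result = pd.merge(data1, data2, on='key')

# Closing prices of each stock (the _x and _y suffixes are added by the merge)
x1 = result['Close_x']
y1 = result['Close_y']

# Run the cointegration test
coin_result = ts.coint(x1, y1)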
The code is self-explanatory:
1) Import the necessary packages.
2) Fetch one year of Facebook and Apple stock data.
3) Merge the data on the date column.
4) Select the closing prices.
5) Run the cointegration test.
6) The variable coin_result holds the output of the cointegration test: the test statistic, the p-value, and the critical values.