The curious case of ARIMA modelling using R

Shreyes picture Shreyes · Apr 8, 2015 · Viewed 7.2k times · Source

I observed something strange while fitting an ARMA model using the function arma{tseries} and arima{stats} in R.

There is a radical difference in the estimation procedures adopted by the two functions, Kalman filter in arima{stats} as opposed to ML estimation in arma{tseries}.

Given the difference in the estimation procedures between the two functions, one would not expect the results to be radically different for the two function if we use the same timeseries.

Well seems that they can!

Generate the below timeseries and add 2 outliers.

set.seed(1010)
ts.sim <- abs(arima.sim(list(order = c(1,0,0), ar = 0.7), n = 50))
ts.sim[8] <- ts.sim[12]*8
ts.sim[35] <- ts.sim[32]*8

Fit an ARMA model using the two function.

# Works perfectly fine
arima(ts.sim, order = c(1,0,0))
# Works perfectly fine
arma(ts.sim, order = c(1,0))

Change the level of the timeseries by a factor of 1 billion

# Introduce a multiplicative shift
ts.sim.1 <- ts.sim*1000000000
options(scipen = 999)
summary(ts.sim.1)

Fit an ARMA model using the 2 functions:

# Works perfectly fine
arma(ts.sim.1, order = c(1,0))

# Does not work
arima(ts.sim.1, order = c(1,0,0))

## Error in solve.default(res$hessian * n.used, A): system is 
 computationally singular: reciprocal condition number = 1.90892e-19

Where I figured out this problem was when SAS software was successfully able to run the proc x12 procedure to conduct the seasonality test but the same function on R gave me the error above. That made me really wonder and look at the SAS results with skepticism but it turn out, it might just be something to do with the arima{stats}.

Can anyone try to elaborate the reason for the above error which limits us to fit a model using arima{stats}?

Answer

Rob Hyndman picture Rob Hyndman · Apr 9, 2015

The problem arises in the stats::arima function when calculating the covariance matrix of the coefficients. The code is not very robust to scale effects due to large numbers, and crashes in computing the inverse of the Hessian matrix in this line:

var <- crossprod(A, solve(res$hessian * n.used, A))

The problem is avoided by simply scaling the data. For example

arima(ts.sim.1/100, order = c(1,0,0))

will work.

The tseries::arma function does not work "perfectly fine" though. It returns a warning message:

In arma(ts.sim.1, order = c(1, 0)) : Hessian negative-semidefinite

This can also be avoided by scaling:

arma(ts.sim.1/1000, order = c(1,0))