Accuracy testing of forecasts

Summer-Jade Gleek'away · Sep 16, 2014

I found a site that explains exactly what I need to do with my data, but it isn't written for R. Can anyone suggest how I could do this in R?

http://people.duke.edu/~rnau/three.htm

I need to find the MSE, MAE, MAPE, ME, MPE and SSE to test the accuracy of the forecasts, and this page is the closest I have found to explaining how to do it.

data<-c(79160.56266,91759.73029,91186.47551,106353.8192,70346.46525,80279.15139,82611.60076,131392.7209,93798.99391,105944.7752,103913.1296,154530.6937,110157.4025,117416.0942,127423.4206,156751.9979,120097.8068,121307.7534,115021.1187,150657.8258,113711.5282,115353.1395,112701.9846,154319.1785,116803.545,118352.535)
forecasts<-c(118082.3,157303.8,117938.7,122329.8) # found using arima

(if you mark this question down can you explain specifically why please)

Answer

nrussell · Sep 16, 2014

Here are a few examples to get you started, using the data set UKNonDurables from the package AER. This package accompanies the book Applied Econometrics with R, which is a pretty good introductory applied econometrics book, especially for people without a solid background in programming.

library(forecast)
library(AER) 
##
data("UKNonDurables")
## alias for convenience
Data <- UKNonDurables
## split data into testing and training
train <- window(
  Data,
  end=c(1975,4))
test <- window(
  Data,
  start=c(1976,1))
## fit a model on training data
aaFit <- auto.arima(
  train)
## forecast from the training model over
## the testing period
aaPred <- forecast(
  aaFit,
  h=length(test))
##
> plot(aaPred)


## extract point forecasts
yHat <- aaPred$mean
## a few functions:
## mean squared (prediction) error
MSE <- function(y,yhat)
{
  mean((y-yhat)^2)
}
## mean absolute (prediction) error
MAE <- function(y,yhat)
{
  mean(abs(y-yhat))
}
## mean absolute percentage (prediction) error
MAPE <- function(y,yhat,percent=TRUE)
{
  if(percent){
    100*mean(abs( (y-yhat)/y ))
  } else {
    mean(abs( (y-yhat)/y ))
  }
}
##
> MSE(test,yHat)
[1] 9646434
> MAE(test,yHat)
[1] 1948.803
> MAPE(test,yHat)
[1] 3.769978

Some or all of the above functions probably exist in base R or in external packages, but they are simple formulas that are trivial to implement yourself. Try to work from these and/or adapt them to better suit your needs.
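The question also asks for ME, MPE and SSE, which follow the same pattern as the functions above. Here is a minimal sketch, assuming errors are defined as actual minus forecast (the same convention as MSE and MAE above). RMSE is added only for convenience, and the toy vectors at the end are made up purely for illustration:

```r
## Sketch only: y = actual values, yhat = forecasts (same length).
## Errors are taken as actual minus forecast.

## mean (prediction) error: signed, so it shows bias
ME <- function(y, yhat) {
  mean(y - yhat)
}
## mean percentage (prediction) error
MPE <- function(y, yhat, percent = TRUE) {
  k <- if (percent) 100 else 1
  k * mean((y - yhat) / y)
}
## sum of squared (prediction) errors
SSE <- function(y, yhat) {
  sum((y - yhat)^2)
}
## root mean squared (prediction) error
RMSE <- function(y, yhat) {
  sqrt(mean((y - yhat)^2))
}

## toy values, invented for illustration only
y    <- c(100, 200, 400)
yhat <- c(110, 190, 360)
ME(y, yhat)
MPE(y, yhat)
SSE(y, yhat)   # 1800
RMSE(y, yhat)
```

Because ME and MPE keep the sign of each error, positive and negative errors cancel, so they indicate bias rather than overall error size. Like MAPE, MPE divides by y and so breaks down if any actual value is zero.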

Edit: As Mr. Hyndman pointed out below, his package forecast includes the function accuracy, which provides a very convenient way of summarizing goodness-of-fit (GOF) measures of time series models. Using the same data from above, you can easily assess the fit of a forecast object over the training and testing periods:

> round(accuracy(aaPred,Data),3)
                   ME     RMSE      MAE   MPE  MAPE  MASE  ACF1 Theil's U
Training set    2.961  372.104  277.728 0.001 0.809 0.337 0.053        NA
Test set     1761.016 3105.871 1948.803 3.312 3.770 2.364 0.849     1.004

(where round(...,3) was used just so that the output would fit nicely in this post). Or, if you want to examine these measures for only the forecast period, you can call something like this:

> accuracy(yHat,test)
               ME     RMSE      MAE      MPE     MAPE      ACF1 Theil's U
Test set 1761.016 3105.871 1948.803 3.312358 3.769978 0.8485389  1.004442
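
The same approach might be applied to the series in the question. Accuracy can only be scored against observed values, so if the four posted forecasts extend beyond the last observation, there is nothing yet to compare them with. One option is to hold out the last few observations instead. The sketch below does this; note that frequency = 4 (quarterly data) and the hold-out length h = 4 are assumptions, since the question states neither:

```r
library(forecast)

## series from the question
data <- c(79160.56266, 91759.73029, 91186.47551, 106353.8192,
          70346.46525, 80279.15139, 82611.60076, 131392.7209,
          93798.99391, 105944.7752, 103913.1296, 154530.6937,
          110157.4025, 117416.0942, 127423.4206, 156751.9979,
          120097.8068, 121307.7534, 115021.1187, 150657.8258,
          113711.5282, 115353.1395, 112701.9846, 154319.1785,
          116803.545, 118352.535)

## ASSUMPTION: quarterly data; the question does not state the frequency
y <- ts(data, frequency = 4)
h <- 4            # ASSUMPTION: hold out the last 4 observations
n <- length(y)

## first n-h points for fitting, last h points for scoring
train <- window(y, end = time(y)[n - h])
test  <- window(y, start = time(y)[n - h + 1])

fit <- auto.arima(train)
fc  <- forecast(fit, h = h)

## ME, RMSE, MAE, MPE, MAPE, ... for the training and hold-out periods
accuracy(fc, test)

## SSE is not reported by accuracy(), so compute it from the errors
sum((as.numeric(test) - as.numeric(fc$mean))^2)
```

With only 22 training points the selected model and the resulting measures may be unstable, so treat the numbers as a rough guide.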