R neuralnet does not converge within stepmax for time series

bkpenator picture bkpenator · May 19, 2013 · Viewed 12.6k times · Source

I'm writing a neural network for prediction of elements in a time series x + sin(x^2) in R, using the neuralnet package. This is how training data is being generated, assuming a window of 4 elements, and that the last one is the one that has to be predicted:

nntr0 <- ((1:25) + sin((1:25)^2))
nntr1 <- ((2:26) + sin((2:26)^2))
nntr2 <- ((3:27) + sin((3:27)^2))
nntr3 <- ((4:28) + sin((4:28)^2))
nntr4 <- ((5:29) + sin((5:29)^2))

Then, I turn these into a data.frame:

nntr <- data.frame(nntr0, nntr1, nntr2, nntr3, nntr4)

Then, I proceed to train the NN:

net.sinp <- neuralnet(nntr4 ~ nntr0 + nntr1 + nntr2 + nntr3, data=nntr, hidden=10, threshold=0.04, act.fct="tanh", linear.output=TRUE, stepmax=100000)

Which, after a while, gives me the message

Warning message:
algorithm did not converge in 1 of 1 repetition(s) within the stepmax 
Call: neuralnet(formula = nntr4 ~ nntr0 + nntr1 + nntr2 + nntr3, data = nntr,     hidden = 10, threshold = 0.04, stepmax = 100000, act.fct = "tanh", linear.output = TRUE)

Can anyone help me figure out why it is not converging? Many thanks

Answer

Vincent Zoonekynd picture Vincent Zoonekynd · May 19, 2013

With tanh as an activation function (it is bounded), it is very difficult to reproduce the linear trend in your signal.

You can use linear activation functions instead, or try to detrend the signal.

# Data
dx <- 1
n <- 25
x <- seq(0,by=dx,length=n+4)
y <- x + sin(x^2)
y0 <- y[1:n]
y1 <- y[1 + 1:n]
y2 <- y[2 + 1:n]
y3 <- y[3 + 1:n]
y4 <- y[4 + 1:n]
d <- data.frame(y0, y1, y2, y3, y4)
library(neuralnet)

# Linear activation functions
r <- neuralnet(y4 ~ y0 + y1 + y2 + y3, data=d, hidden=10)
plot(y4, compute(r, d[,-5])$net.result)

# No trend
d2 <- data.frame(
  y0 = y0 - x[1:n], 
  y1 = y1 - x[1 + 1:n], 
  y2 = y2 - x[2 + 1:n], 
  y3 = y3 - x[3 + 1:n], 
  y4 = y4 - x[4 + 1:n]
)
r <- neuralnet(y4 ~ y0 + y1 + y2 + y3, data=d2, hidden=10, act.fct="tanh" )
plot(d2$y4, compute(r, d2[,-5])$net.result)