I have a time series dataset covering a whole year (the date is the index). The data was measured every 15 minutes throughout the year, which results in 96 timesteps per day. The data is already normalized. The variables are correlated. All the variables except VAR are weather measures.
VAR is seasonal with both a daily and a weekly period (it looks a bit different on weekends, but more or less the same every weekend). The VAR values are stationary. I would like to predict the values of VAR for the next two days (192 steps ahead) and for the next seven days (672 steps ahead).
Here is a sample of the dataset:
DateIdx VAR dewpt hum press temp
2017-04-17 00:00:00 0.369397 0.155039 0.386792 0.196721 0.238889
2017-04-17 00:15:00 0.363214 0.147287 0.429245 0.196721 0.233333
2017-04-17 00:30:00 0.357032 0.139535 0.471698 0.196721 0.227778
2017-04-17 00:45:00 0.323029 0.127907 0.429245 0.204918 0.219444
2017-04-17 01:00:00 0.347759 0.116279 0.386792 0.213115 0.211111
2017-04-17 01:15:00 0.346213 0.127907 0.476415 0.204918 0.169444
2017-04-17 01:30:00 0.259660 0.139535 0.566038 0.196721 0.127778
2017-04-17 01:45:00 0.205564 0.073643 0.523585 0.172131 0.091667
2017-04-17 02:00:00 0.157650 0.007752 0.481132 0.147541 0.055556
2017-04-17 02:15:00 0.122101 0.003876 0.476415 0.122951 0.091667
I have decided to use an LSTM in Keras. Having data from the whole year, I used the data from the first 329 days for training and the rest for validation during training. train_X contains all the measures (including VAR) from those 329 days; train_Y contains only VAR from the same 329 days, shifted one step ahead. The remaining timesteps go into test_X and test_Y.
Here is the code I use to prepare train_X and train_Y:
# X -> the whole dataframe
# Y -> the VAR vector from the whole dataframe, already shifted 1 step ahead
# 329 * 96 = 31584
train_X = X[:31584]
train_X = train_X.reshape(train_X.shape[0], 1, 5)
train_Y = Y[:31584]
train_Y = train_Y.reshape(train_Y.shape[0], 1)
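For reference, a minimal sketch of how X and Y could be built with pandas (df is a hypothetical name for the full DataFrame shown above):
# df: the full pandas DataFrame indexed by DateIdx (hypothetical name)
X = df.values                   # all 5 columns, shape (total_samples, 5)
Y = df['VAR'].shift(-1).values  # VAR moved one step ahead
X, Y = X[:-1], Y[:-1]           # drop the last row, whose target is NaN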
To predict the next VAR value I would like to use the past 672 timesteps (a whole week of measures). For this reason I have set batch_size=672, so the fit command looks like this:
history = model.fit(train_X, train_Y, epochs=50, batch_size=672, validation_data=(test_X, test_Y), shuffle=False)
Here is the architecture of my network:
model = models.Sequential()
model.add(layers.LSTM(672, input_shape=(None, 672), return_sequences=True))
model.add(layers.Dropout(0.2))
model.add(layers.LSTM(336, return_sequences=True))
model.add(layers.Dropout(0.2))
model.add(layers.LSTM(168, return_sequences=True))
model.add(layers.Dropout(0.2))
model.add(layers.LSTM(84, return_sequences=True))
model.add(layers.Dropout(0.2))
model.add(layers.LSTM(21, return_sequences=False))
model.add(layers.Dense(1))
model.compile(loss='mae', optimizer='adam')
model.summary()
From the plot below we can see that the network has learned 'something' after 50 epochs:
Plot from the learning process
For prediction I have prepared a set of data containing the last 672 steps with all values, followed by 96 steps without the VAR value, which is what should be predicted (a sketch of how such a frame could be built follows the listing below). I also used autoregression: after each prediction I wrote the predicted VAR back into the dataset and used it for the next prediction.
The predX dataset (used for prediction) looks like this:
print(predX['VAR'][668:677])
DateIdx VAR
2017-04-23 23:00:00 0.307573
2017-04-23 23:15:00 0.278207
2017-04-23 23:30:00 0.284390
2017-04-23 23:45:00 0.309118
2017-04-24 00:00:00 NaN
2017-04-24 00:15:00 NaN
2017-04-24 00:30:00 NaN
2017-04-24 00:45:00 NaN
2017-04-24 01:00:00 NaN
Name: VAR, dtype: float64
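For context, such a frame could be assembled like this (a sketch with hypothetical names; df holds the full history, and the weather columns of the future rows would be filled from a forecast):
import pandas as pd

# 96 future timestamps, 15 minutes apart, following the last known row
future_index = pd.date_range(df.index[-1] + pd.Timedelta(minutes=15),
                             periods=96, freq='15min')
future = pd.DataFrame(index=future_index, columns=df.columns, dtype=float)
# weather columns filled from a forecast; VAR stays NaN until predicted
predX = pd.concat([df.iloc[-672:], future])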
Here is the (autoregressive) code I have used to predict the next 96 steps:
stepsAhead = 96
historySteps = 672
for i in range(0, stepsAhead):
    j = i + historySteps
    # predict one step ahead from the previous 672 rows (all 5 features)
    ypred = model.predict(predX.values[i:j].reshape(1, historySteps, 5))
    # write the prediction back so the next window can use it
    predX['VAR'][j] = ypred[0, 0]
Unfortunately the results are very poor and far from my expectations. Here they are combined with the previous day:
Predicted data combined with a previous day
Apart from the overall 'What have I done wrong?' question, I would like to ask a few specific questions:
Q1. During model fitting, I simply fed in the whole history in batches of size 672. Is that correct? How should I organize the dataset for model fitting? What options do I have? Should I use the "sliding window" approach (as in this link: https://machinelearningmastery.com/promise-recurrent-neural-networks-time-series-forecasting/ )?
Q2. Are 50 epochs enough? What is the common practice here? Maybe the network is underfitted, resulting in the poor prediction? So far I have tried 200 epochs with the same result.
Q3. Should I try a different architecture? Is the proposed network 'big enough' to handle such data? Maybe a 'stateful' network is the right approach here?
Q4. Did I implement the autoregression correctly? Is there another approach to making predictions many steps ahead, e.g. 192 or 672 as in my case?
It looks like there is some confusion about how to organise the data to train an RNN, so let's cover the questions:
Q1. Once your data is a 2D array of shape (total_samples, 5), you can use the TimeseriesGenerator to create a sliding window that will generate batches of shape (batch_size, past_timesteps, 5) for you. In that case, you will use .fit_generator to train the network.
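A minimal sketch of that setup (assuming X is the full (total_samples, 5) array and Y the raw VAR column; note that TimeseriesGenerator itself pairs each target with the window of rows preceding it, so the manual one-step shift is not needed here):
from keras.preprocessing.sequence import TimeseriesGenerator

past_timesteps = 672  # one week of 15-minute steps
# X: (total_samples, 5) feature array, Y: (total_samples,) VAR values
generator = TimeseriesGenerator(X, Y, length=past_timesteps, batch_size=128)
# each batch yields inputs of shape (batch_size, 672, 5)
# and targets of shape (batch_size,)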
Q4. If you train the model to predict a single point, you can chain n predictions after training, feeding each prediction back in as an input. The single-point prediction model could look like:
from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(128, return_sequences=True, input_shape=(past_timesteps, 5)))
model.add(LSTM(64))
model.add(Dense(1))
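Compiling, training with the generator above, and then chaining single-step predictions could look like this (a sketch under the same assumptions; reusing the last known weather row is a naive placeholder, in practice you would plug in forecast weather values, and VAR is assumed to be column 0):
import numpy as np

model.compile(loss='mae', optimizer='adam')
model.fit_generator(generator, epochs=50)

window = X[-past_timesteps:].copy()  # last known week, shape (672, 5)
preds = []
for _ in range(96):  # 96 steps = one day; extend to 192 or 672 as needed
    yhat = model.predict(window.reshape(1, past_timesteps, 5))[0, 0]
    preds.append(yhat)
    next_row = window[-1].copy()  # naive: reuse the last weather values
    next_row[0] = yhat            # assumption: VAR is column 0
    window = np.vstack([window[1:], next_row])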