Why do I get a Keras LSTM RNN input_shape error?

Ravaal picture Ravaal · Mar 17, 2016 · Viewed 8.7k times · Source

I keep getting an input_shape error from the following code.

from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout
from keras.layers.recurrent import LSTM

def _load_data(data):
    """
    data should be pd.DataFrame()
    """
    n_prev = 10
    docX, docY = [], []
    for i in range(len(data)-n_prev):
        docX.append(data.iloc[i:i+n_prev].as_matrix())
        docY.append(data.iloc[i+n_prev].as_matrix())
    if not docX:
        pass
    else:
        alsX = np.array(docX)
        alsY = np.array(docY)
        return alsX, alsY

X, y = _load_data(dframe)
poi = int(len(X) * .8)
X_train = X[:poi]
X_test = X[poi:]
y_train = y[:poi]
y_test = y[poi:]

input_dim = 3

All of the above runs smoothly. This is where it goes wrong.

in_out_neurons = 2
hidden_neurons = 300
model = Sequential()
#model.add(Masking(mask_value=0, input_shape=(input_dim,)))
model.add(LSTM(in_out_neurons, hidden_neurons, return_sequences=False, input_shape=(len(full_data),)))
model.add(Dense(hidden_neurons, in_out_neurons))
model.add(Activation("linear"))
model.compile(loss="mean_squared_error", optimizer="rmsprop")
model.fit(X_train, y_train, nb_epoch=10, validation_split=0.05)

It returns this error.

Exception: Invalid input shape - Layer expects input ndim=3, was provided with input shape (None, 10320)

When I check the website it says to specify a tuple "(e.g. (100,) for 100-dimensional inputs)."

That being said, my data set consists of one column with a length of 10320. I assume that that means that I should be putting (10320,) in as the input_shape, but I get the error anyways. Does anyone have a solution?

Answer

Radix picture Radix · Jun 28, 2016

My understanding is that both your input and your output are one dimensional vectors. The trick is to reshape them per Keras requirements:

from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout
from keras.layers.recurrent import LSTM
import numpy as np

X= np.random.rand(1000)
y = 2*X

poi = int(len(X) * .8)
X_train = X[:poi]
y_train = y[:poi]

X_test = X[poi:]
y_test = y[poi:]

# you have to change your input shape (nb_samples, timesteps, input_dim)
X_train = X_train.reshape(len(X_train), 1, 1)
# and also the output shape (note that the output *shape* is 2 dimensional)
y_train = y_train.reshape(len(y_train), 1)


#in_out_neurons = 2 
in_out_neurons = 1

hidden_neurons = 300
model = Sequential()
#model.add(Masking(mask_value=0, input_shape=(input_dim,)))
model.add(LSTM(hidden_neurons, return_sequences=False, batch_input_shape=X_train.shape))
# only specify the output dimension
model.add(Dense(in_out_neurons))
model.add(Activation("linear"))
model.compile(loss="mean_squared_error", optimizer="rmsprop")
model.fit(X_train, y_train, nb_epoch=10, validation_split=0.05)

# calculate test set MSE
preds = model.predict(X_test).reshape(len(y_test))
MSE = np.mean((preds-y_test)**2)

Here are the key points:

  • when you add your first layer, you are required to specify the number of hidden nodes, and your input shape. Consequent layers don't require the input shape as they can infer it from the hidden nodes of the previous layer
  • Similarly, for your output layer you only specify the number of output nodes

Hope this helps.