How to use the Embedding Layer for Recurrent Neural Network (RNN) in Keras

Kito · Jan 29, 2016 · Viewed 9.3k times

I'm rather new to neural networks and the Keras library, and I'm wondering how I can use the Embedding layer as described here to map my input data from a 2D tensor to a 3D tensor for an RNN.

Say my time-series data looks as follows (with increasing time):

X_train = [
   [1.0,2.0,3.0,4.0],
   [2.0,5.0,6.0,7.0],
   [3.0,8.0,9.0,10.0],
   [4.0,11.0,12.0,13.0],
   ...
] # with a length of 1000

Now, say I would want to give the RNN the last 2 feature vectors in order to predict the feature vector for time t+1.

Currently (without the Embedding Layer), I am creating the required 3D tensor with shape (nb_samples, timesteps, input_dim) myself (as in this example here).

Applied to my example, the final 3D tensor would then look as follows:

X_train_2 = [
  [[1.0,2.0,3.0,4.0],
   [2.0,5.0,6.0,7.0]],
  [[2.0,5.0,6.0,7.0],
   [3.0,8.0,9.0,10.0]],
  [[3.0,8.0,9.0,10.0],
   [4.0,11.0,12.0,13.0]],
  etc...
]

and Y_train:

Y_train = [
   [3.0,8.0,9.0,10.0],
   [4.0,11.0,12.0,13.0],
   etc...
]
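For reference, the sliding-window construction above can be done in a few lines of NumPy (a sketch with my own variable names; window = 2 matches the two past vectors described above):

```python
import numpy as np

X_train = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 5.0, 6.0, 7.0],
    [3.0, 8.0, 9.0, 10.0],
    [4.0, 11.0, 12.0, 13.0],
])  # shape (n, 4); the real data has n = 1000

window = 2  # number of past feature vectors fed to the RNN

# Stack windows of consecutive rows: sample i covers rows i .. i+window-1
X_train_2 = np.stack([X_train[i:i + window]
                      for i in range(len(X_train) - window)])
# The target for each window is the row that follows it (time t+1)
Y_train = X_train[window:]

print(X_train_2.shape)  # (2, 2, 4) for this toy example
print(Y_train.shape)    # (2, 4)
```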

My model looks as follows (adapted to the simplified example above):

import numpy as np
from keras.models import Sequential
from keras.layers import SimpleRNN, Dense, Activation

num_of_vectors = 2   # timesteps: the last 2 feature vectors
vect_dimension = 4   # features per vector
hidden_neurons = 50  # size of the RNN's hidden state

model = Sequential()
model.add(SimpleRNN(hidden_neurons, return_sequences=False,
                    input_shape=(num_of_vectors, vect_dimension)))
model.add(Dense(vect_dimension))
model.add(Activation("linear"))
model.compile(loss="mean_squared_error", optimizer="rmsprop")
model.fit(X_train_2, Y_train, batch_size=50, nb_epoch=10, validation_split=0.15)

And finally, my question is: how can I avoid doing that 2D-to-3D reshaping myself and use the Embedding layer instead? I guess after model = Sequential() I would have to add something like:

model.add(Embedding(?????))

Probably the answer is rather simple; I'm just confused by the documentation of the Embedding layer.

Answer

nog · Feb 10, 2016

You can use it as follows:

Note:

  1. I generated X and y as zeros just to give you an idea of the input structure.

  2. If you have a multi-class y_train, you will need to binarize it (e.g. one-hot encode it).

  3. You might need to pad your sequences if they have varying lengths.

  4. If I understood correctly that you want to predict at time t+1, you might want to look at sequence-to-sequence learning.

Try something like:

import numpy as np
from keras.models import Sequential
from keras.layers import Embedding, SimpleRNN, Dense, Activation

hidden_neurons = 4
nb_classes = 3
embedding_size = 10
vocab_size = 4   # number of distinct input indices (input_dim of the Embedding)
maxlen = 4       # timesteps per sample

# Dummy data: X holds integer indices (the Embedding layer expects ints),
# y holds one-hot class labels.
X = np.zeros((128, maxlen), dtype=np.int32)
y = np.zeros((128, nb_classes), dtype=np.int8)

model = Sequential()
model.add(Embedding(vocab_size, embedding_size))
model.add(SimpleRNN(hidden_neurons, return_sequences=False))
model.add(Dense(nb_classes))
model.add(Activation("softmax"))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
model.fit(X, y, batch_size=1, nb_epoch=1)
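Note that the Embedding layer expects integer indices as input, not real-valued feature vectors: conceptually it is a trainable lookup table that maps a (nb_samples, timesteps) array of ints to a (nb_samples, timesteps, embedding_size) float tensor, which is exactly the 3D shape the RNN consumes. A minimal NumPy sketch of that lookup (the weight matrix would normally be learned; names are mine):

```python
import numpy as np

vocab_size, embedding_size = 4, 10
rng = np.random.default_rng(0)

# The layer's weight matrix: one embedding_size-dim row per index
W = rng.standard_normal((vocab_size, embedding_size))

X = np.array([[0, 2, 1, 3],
              [1, 1, 0, 2]])  # (nb_samples=2, timesteps=4) integer indices

embedded = W[X]               # fancy indexing performs the lookup
print(embedded.shape)         # (2, 4, 10) -> what the RNN receives
```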