Bidirectional LSTM with Batch Normalization in Keras

abolfazl · Jun 21, 2017 · Viewed 8k times

I was wondering how to implement a biLSTM with Batch Normalization (BN) in Keras. I know that a BN layer should go between the linearity and the nonlinearity, i.e. the activation. This is easy to implement with CNN or Dense layers, but how can I do it with a biLSTM?
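For example, with a Dense layer the pattern I mean looks like this (a minimal sketch; the sizes are arbitrary):

from keras.models import Sequential
from keras.layers import Dense, BatchNormalization, Activation

model = Sequential()
model.add(Dense(64, activation=None, input_shape=(10,)))  # linear part only
model.add(BatchNormalization())                           # normalize pre-activations
model.add(Activation('relu'))                             # nonlinearity after BN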

Thanks in advance.

Answer

albarji · Aug 31, 2017

If you want to apply BatchNormalization over the linear outputs of an LSTM, you can do it as follows:

from keras.models import Sequential
from keras.layers import LSTM, Bidirectional, BatchNormalization

model = Sequential()
# activation=None removes the LSTM's output non-linearity, so the layer
# emits its linear outputs; (256, 10) is an example input shape of
# 256 timesteps with 10 features each
model.add(Bidirectional(LSTM(128, activation=None), input_shape=(256, 10)))
model.add(BatchNormalization())

Essentially, you are removing the non-linear activation of the LSTM output (but not the gate activations, which remain sigmoid), and then applying BatchNormalization to the outputs.
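If you then want the usual nonlinearity back, so that BN sits between the linearity and the activation as described in the question, you can re-apply it after the BN layer. A minimal sketch (tanh is the LSTM's default output activation; the shape is just an example):

from keras.models import Sequential
from keras.layers import LSTM, Bidirectional, BatchNormalization, Activation

model = Sequential()
model.add(Bidirectional(LSTM(128, activation=None), input_shape=(256, 10)))
model.add(BatchNormalization())   # normalize the linear biLSTM outputs
model.add(Activation('tanh'))     # re-apply the activation after BN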

If what you want is to apply BatchNormalization inside the LSTM itself, e.g. on the recurrent (hidden-to-hidden) flow, I'm afraid that feature has not been implemented in Keras.
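A rough substitute, though not true recurrent batch normalization, is to normalize the features the biLSTM sees at every timestep by placing a BatchNormalization layer in front of it; with the default axis=-1 it normalizes the feature axis of a 3D (batch, timesteps, features) input. A sketch, again with an example shape:

from keras.models import Sequential
from keras.layers import LSTM, Bidirectional, BatchNormalization

model = Sequential()
# Normalize each input feature over the batch and timesteps
# before it enters the biLSTM (256 timesteps x 10 features)
model.add(BatchNormalization(input_shape=(256, 10)))
model.add(Bidirectional(LSTM(128)))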