TimeDistributed vs. TimeDistributedDense in Keras

Wasi Ahmad · Feb 22, 2017

I have gone through the official documentation but still can't understand what TimeDistributed actually does as a layer in a Keras model.

I also don't understand the difference between TimeDistributed and TimeDistributedDense. When would someone use TimeDistributedDense? Is it only to reduce the training data set? Does it have other benefits?

Can anyone explain, with a precise example, what these two types of layer wrappers do?

Answer

Marcin Możejko · Feb 22, 2017

Basically, TimeDistributedDense was introduced first, in early versions of Keras, in order to apply a Dense layer stepwise to sequences. TimeDistributed is a Keras wrapper which makes it possible to take any static (non-sequential) layer and apply it in a sequential manner. So if, e.g., your layer accepts as input something of shape (d1, ..., dn), then thanks to the TimeDistributed wrapper it can accept an input of shape (sequence_len, d1, ..., dn), with the wrapped layer applied to each of X[0,:,...,:], X[1,:,...,:], ..., X[sequence_len-1,:,...,:].
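To make the step-wise application concrete, here is a minimal NumPy sketch of the idea, not Keras' actual implementation. The helper time_distributed and the toy dense function are hypothetical names made up for illustration, and the batch axis that Keras puts in front of the sequence axis is left out.

```python
import numpy as np

def time_distributed(layer_fn, x):
    """Apply layer_fn independently to every time step of x.

    x has shape (sequence_len, d1, ..., dn); layer_fn maps a single step
    of shape (d1, ..., dn) to an output array.
    """
    return np.stack([layer_fn(x[t]) for t in range(x.shape[0])], axis=0)

# Hypothetical "static" layer: a dense map from 3 features to 2, no activation.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))
b = np.zeros(2)

def dense(step):
    return step @ W + b

X = rng.normal(size=(5, 3))      # 5 time steps, 3 features each
Y = time_distributed(dense, X)   # the same W and b are reused at every step
print(Y.shape)                   # (5, 2)
```

The point to notice is that the weights are shared: there is one W and one b, not one per time step.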

An example of such usage might be applying, e.g., a pretrained convolutional layer to a short video clip via TimeDistributed(conv_layer), where conv_layer is applied to each frame of the clip. It produces a sequence of outputs which might then be consumed by the next recurrent or TimeDistributed layer.
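A rough sketch of the video case, assuming the tensorflow.keras API. The clip size (8 frames of 32x32 RGB) and the layer sizes are arbitrary choices for illustration, and a freshly initialised Conv2D stands in for the pretrained convolutional layer.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Assumed clip format: 8 frames of 32x32 RGB images.
frames, height, width, channels = 8, 32, 32, 3

model = keras.Sequential([
    keras.Input(shape=(frames, height, width, channels)),
    # The same Conv2D weights are applied to every frame of the clip.
    layers.TimeDistributed(layers.Conv2D(4, kernel_size=3, activation="relu")),
    # Reduce each frame's feature map to one vector per frame.
    layers.TimeDistributed(layers.GlobalAveragePooling2D()),
    # A recurrent layer consumes the resulting per-frame sequence.
    layers.LSTM(16),
])

clip_batch = np.random.rand(2, frames, height, width, channels).astype("float32")
features = model.predict(clip_batch, verbose=0)
print(features.shape)  # (2, 16)
```

Without the wrapper, Conv2D would reject the 5D input (batch, frames, height, width, channels), since it expects a single image per sample.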

It's good to know that usage of TimeDistributedDense is deprecated and it's better to use TimeDistributed(Dense).
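For reference, a small sketch of the TimeDistributed(Dense) form, again assuming the tensorflow.keras API. The sizes (10 time steps, 16 features, 8 units) are arbitrary.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

timesteps, features, units = 10, 16, 8

inputs = keras.Input(shape=(timesteps, features))
td_dense = layers.TimeDistributed(layers.Dense(units))
outputs = td_dense(inputs)
model = keras.Model(inputs, outputs)

x = np.random.rand(4, timesteps, features).astype("float32")
y = model.predict(x, verbose=0)
print(y.shape)  # (4, 10, 8)

# A single kernel is shared by all time steps: shape (features, units).
kernel, bias = td_dense.layer.get_weights()
print(kernel.shape)  # (16, 8)
```

Each output step y[i, t] is the Dense layer applied to x[i, t] alone, so the number of parameters does not depend on the sequence length.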