I have gone through the official documentation but still can't understand what TimeDistributed actually does as a layer in a Keras model.
I also couldn't understand the difference between TimeDistributed and TimeDistributedDense. When would someone use TimeDistributedDense? Is it only to reduce the training data set? Does it have any other benefit?
Can anyone explain, with a precise example, what these two types of layer wrappers do?
Basically, TimeDistributedDense was introduced in early versions of Keras in order to apply a Dense layer stepwise to sequences. TimeDistributed is a Keras wrapper that makes it possible to take any static (non-sequential) layer and apply it in a sequential manner. So if, for example, your layer accepts an input of shape (d1, ..., dn), then thanks to the TimeDistributed wrapper it can accept an input of shape (sequence_len, d1, ..., dn): the wrapped layer, with the same weights at every step, is applied to X[0,:,...,:], X[1,:,...,:], ..., X[sequence_len-1,:,...,:].
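To make the shapes concrete, here is a minimal sketch, assuming a TensorFlow install that exposes Keras as tensorflow.keras. The sizes (sequence_len, features, units) are arbitrary values picked for illustration, and the batch axis that Keras adds in front of the sequence axis is included. It checks that wrapping a Dense layer gives the same result as calling that Dense layer by hand on every time step:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Arbitrary illustrative sizes (assumptions, not from the question).
sequence_len, features, units = 5, 8, 3

dense = layers.Dense(units)  # a static layer: expects input of shape (features,)
inputs = keras.Input(shape=(sequence_len, features))
outputs = layers.TimeDistributed(dense)(inputs)
model = keras.Model(inputs, outputs)

# A batch of 2 sequences; axis 0 is the batch, axis 1 is the time step.
x = np.random.rand(2, sequence_len, features).astype("float32")
y = np.asarray(model(x))  # shape: (2, sequence_len, units)

# Apply the very same Dense layer (same weights) to each time step by hand.
manual = np.stack(
    [np.asarray(dense(x[:, t, :])) for t in range(sequence_len)], axis=1
)

print(y.shape)
print(np.allclose(y, manual, atol=1e-5))
```

The wrapper adds no weights of its own; the only parameters are those of the wrapped Dense layer, shared across all time steps.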
An example of such usage is applying a pretrained convolutional layer to a short video clip with TimeDistributed(conv_layer), where conv_layer is applied to each frame of the clip. This produces a sequence of outputs, which can then be consumed by a following recurrent or TimeDistributed layer.
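A rough sketch of that video setup, again assuming tensorflow.keras. The clip dimensions are hypothetical, conv_layer here is a freshly initialised Conv2D standing in for a pretrained one, and the pooling step and LSTM size are just one possible way to feed the per-frame outputs into a recurrent layer:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical clip size: 4 frames of 32x32 RGB.
frames, height, width, channels = 4, 32, 32, 3

# Stand-in for a pretrained convolutional layer (randomly initialised here).
conv_layer = layers.Conv2D(filters=8, kernel_size=3, activation="relu")

clip = keras.Input(shape=(frames, height, width, channels))
x = layers.TimeDistributed(conv_layer)(clip)  # conv applied to each frame
# Collapse each frame's feature map to a vector so the LSTM gets (frames, 8).
x = layers.TimeDistributed(layers.GlobalAveragePooling2D())(x)
out = layers.LSTM(16)(x)  # consumes the sequence of per-frame features
model = keras.Model(clip, out)

batch = np.random.rand(2, frames, height, width, channels).astype("float32")
clip_features = np.asarray(model(batch))
print(clip_features.shape)
```

With a real pretrained network you would wrap that model in TimeDistributed in the same way in place of the single Conv2D.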
It's good to know that TimeDistributedDense is deprecated and it's better to use TimeDistributed(Dense) instead.