I have a dataset X consisting of N = 4000 samples; each sample consists of d = 2 features (continuous values) spanning back t = 10 time steps. I also have the corresponding 'label' for each sample, which is also a continuous value, at time step 11.
At the moment my dataset is in the shape X: [4000,20], Y: [4000].
I want to train an LSTM using TensorFlow to predict the value of Y (regression), given the 10 previous time steps of the d features, but I am having a tough time implementing this in TensorFlow.
The main problem I have at the moment is understanding how TensorFlow expects the input to be formatted. I have seen various examples such as this, but those examples deal with one long continuous time series. My data consists of different samples, each an independent time series.
The documentation of tf.nn.dynamic_rnn states:

inputs: The RNN inputs. If time_major == False (default), this must be a Tensor of shape: [batch_size, max_time, ...], or a nested tuple of such elements.
In your case, this means that the input should have a shape of [batch_size, 10, 2]. Instead of training on all 4000 sequences at once, you'd use only batch_size many of them in each training iteration. Something like the following should work (reshape added for clarity):
import tensorflow as tf

batch_size = 32
# batch_size sequences of length 10 with 2 values for each timestep
input = get_batch(X, batch_size).reshape([batch_size, 10, 2])
# Create LSTM cell with state size 256. Could also use GRUCell, ...
# Note: state_is_tuple=False is deprecated;
# the option might be completely removed in the future
cell = tf.nn.rnn_cell.LSTMCell(256, state_is_tuple=True)
outputs, state = tf.nn.dynamic_rnn(cell,
                                   input,
                                   sequence_length=[10] * batch_size,
                                   dtype=tf.float32)
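(get_batch is a placeholder for however you draw batches from your data.) Since your X is currently [4000, 20], one simple way to produce such batches is to reshape it once with NumPy and sample the same row indices for X and Y. A minimal sketch, assuming the 20 columns are ordered as 10 consecutive time steps with 2 features each (all names below are illustrative):

import numpy as np

# Stand-ins for your data; replace with the real arrays
X = np.random.rand(4000, 20).astype(np.float32)
Y = np.random.rand(4000).astype(np.float32)

# One-time reshape: [4000, 20] -> [4000, 10, 2]
# (assumes column order t0_f0, t0_f1, t1_f0, t1_f1, ..., t9_f1)
X_seq = X.reshape([-1, 10, 2])

# Draw one batch: sample the same row indices for inputs and labels
batch_size = 32
idx = np.random.choice(len(X_seq), size=batch_size, replace=False)
x_batch = X_seq[idx]                       # [batch_size, 10, 2]
y_batch = Y[idx].reshape([batch_size, 1])  # [batch_size, 1]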
From the documentation, outputs will be of shape [batch_size, 10, 256], i.e. one 256-dimensional output for each timestep. state will be an LSTMStateTuple of two tensors (c and h), each of shape [batch_size, 256]. You could predict your final value, one for each sequence, from that:
predictions = tf.contrib.layers.fully_connected(state.h,
                                                num_outputs=1,
                                                activation_fn=None)
loss = get_loss(get_batch(Y, batch_size).reshape([batch_size, 1]), predictions)
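get_loss is likewise left open here. As a sketch of how everything could be wired together end to end, with mean squared error standing in for get_loss, Adam as the optimizer, and placeholders for feeding batches (the helper names and hyperparameters are assumptions, not part of the answer above):

import numpy as np
import tensorflow as tf

# Stand-ins for the reshaped data ([4000, 10, 2] inputs, [4000, 1] labels)
X_seq = np.random.rand(4000, 10, 2).astype(np.float32)
Y = np.random.rand(4000, 1).astype(np.float32)

x_ph = tf.placeholder(tf.float32, [None, 10, 2])
y_ph = tf.placeholder(tf.float32, [None, 1])

cell = tf.nn.rnn_cell.LSTMCell(256, state_is_tuple=True)
# sequence_length can be omitted here because every sequence uses all 10 steps
outputs, state = tf.nn.dynamic_rnn(cell, x_ph, dtype=tf.float32)

predictions = tf.contrib.layers.fully_connected(state.h,
                                                num_outputs=1,
                                                activation_fn=None)
loss = tf.losses.mean_squared_error(labels=y_ph, predictions=predictions)
train_op = tf.train.AdamOptimizer(learning_rate=1e-3).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(1000):
        idx = np.random.choice(len(X_seq), size=32, replace=False)
        _, batch_loss = sess.run([train_op, loss],
                                 feed_dict={x_ph: X_seq[idx], y_ph: Y[idx]})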
The number 256 in the shapes of outputs and state is determined by cell.output_size and cell.state_size, respectively. When creating the LSTMCell as above, these are the same. Also see the LSTMCell documentation.
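If you want to verify these sizes yourself, they can be inspected directly on the cell and on the tensors returned by tf.nn.dynamic_rnn; a small sketch:

import tensorflow as tf

cell = tf.nn.rnn_cell.LSTMCell(256, state_is_tuple=True)
print(cell.output_size)   # 256
print(cell.state_size)    # LSTMStateTuple(c=256, h=256)

x = tf.placeholder(tf.float32, [None, 10, 2])
outputs, state = tf.nn.dynamic_rnn(cell, x, dtype=tf.float32)
print(outputs.shape)      # (?, 10, 256)
print(state.h.shape)      # (?, 256)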