UPDATE: after looking back on this question, most of the code was unnecessary. To make a long story short, the hidden state of a PyTorch RNN needs to be a torch.Tensor. When I posted the question, the hidden state was a tuple.
Below is my data loader.
import numpy as np
import torch
from torch.utils.data import TensorDataset, DataLoader

def batch_data(log_returns, sequence_length, batch_size):
    """
    Batch the neural network data using DataLoader
    :param log_returns: asset's daily log returns
    :param sequence_length: The sequence length of each batch
    :param batch_size: The size of each batch; the number of sequences in a batch
    :return: DataLoader with batched data
    """
    # total number of full batches we can make
    n_batches = len(log_returns) // batch_size

    # keep only enough returns to make full batches
    log_returns = log_returns[:n_batches * batch_size]

    y_len = len(log_returns) - sequence_length

    x, y = [], []
    for idx in range(0, y_len):
        idx_end = sequence_length + idx
        x_batch = log_returns[idx:idx_end]
        x.append(x_batch)
        # the target is the single return that follows the sequence
        batch_y = log_returns[idx_end]
        y.append(batch_y)

    # create tensor datasets
    x_tensor = torch.from_numpy(np.asarray(x))
    y_tensor = torch.from_numpy(np.asarray(y))

    # make x_tensor 3-d instead of 2-d: (n_sequences, sequence_length, 1)
    x_tensor = x_tensor.unsqueeze(-1)

    data = TensorDataset(x_tensor, y_tensor)
    data_loader = DataLoader(data, shuffle=False, batch_size=batch_size)

    # return a dataloader
    return data_loader
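For what it's worth, a quick shape check on a toy array (the values are made up) looks right to me:

import numpy as np

returns = np.random.randn(20).astype(np.float32)   # fake daily log returns
loader = batch_data(returns, sequence_length=5, batch_size=4)

x, y = next(iter(loader))
print(x.shape)   # torch.Size([4, 5, 1]) -> (batch, seq_len, features)
print(y.shape)   # torch.Size([4])

And here is the init_hidden method from my model: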
def init_hidden(self, batch_size):
    ''' Initializes hidden state '''
    # Create two new tensors with sizes n_layers x batch_size x n_hidden,
    # initialized to zero, for hidden state and cell state of LSTM
    weight = next(self.parameters()).data

    if train_on_gpu:
        hidden = (weight.new(self.n_layers, batch_size, self.n_hidden).zero_().cuda(),
                  weight.new(self.n_layers, batch_size, self.n_hidden).zero_().cuda())
    else:
        hidden = (weight.new(self.n_layers, batch_size, self.n_hidden).zero_(),
                  weight.new(self.n_layers, batch_size, self.n_hidden).zero_())

    return hidden
I don't know what is wrong. When I try to start training the model, I get the following error message:
AttributeError: 'tuple' object has no attribute 'size'
The issue comes from the fact that hidden (in the forward definition) isn't a torch.Tensor. Therefore, r_output, hidden = self.gru(nn_input, hidden) raises a rather confusing error without specifying exactly what's wrong with the arguments, although you can see it's raised inside an nn.RNN method named check_hidden_size()...
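In fact, you can reproduce the exact same error with a standalone nn.GRU (the layer sizes below are made up; only the tuple matters):

import torch
import torch.nn as nn

gru = nn.GRU(input_size=1, hidden_size=8, num_layers=2, batch_first=True)
x = torch.randn(4, 10, 1)            # (batch, seq_len, features)

h0_tuple = (torch.zeros(2, 4, 8),    # LSTM-style (hidden_state, cell_state) tuple
            torch.zeros(2, 4, 8))
gru(x, h0_tuple)                     # AttributeError: 'tuple' object has no attribute 'size'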
I was confused at first, thinking that the second argument of nn.RNN, h0, was a tuple containing (hidden_state, cell_state). The same can be said of the second element returned by that call, hn. That's not the case: h0 and hn are both torch.Tensors. Interestingly enough though, you are able to unpack stacked tensors:
>>> z = torch.stack([torch.Tensor([1,2,3]), torch.Tensor([4,5,6])])
>>> a, b = z
>>> a, b
(tensor([1., 2., 3.]), tensor([4., 5., 6.]))
You are supposed to provide a single tensor as the second argument of an nn.GRU __call__.
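For instance, continuing the repro above, a single 3-D tensor works fine:

h0 = torch.zeros(2, 4, 8)    # (num_layers, batch_size, hidden_size) — one tensor
out, hn = gru(x, h0)         # no error; hn is a tensor with the same shape as h0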
Edit - After further inspection of your code, I found out that you are converting hidden back again to a tuple... In cell [14] you have hidden = tuple([each.data for each in hidden]), which basically overwrites the modification you did in init_hidden with torch.stack.
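If you do want to keep detaching between batches, that line should detach the single tensor instead, e.g.:

# instead of: hidden = tuple([each.data for each in hidden])
hidden = hidden.detach()   # a GRU hidden state is one tensor, not an (h, c) tuple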
Take a step back and look at the source code for RNNBase, the base class for RNN modules. If the hidden state is not given to forward, it will default to:
if hx is None:
    num_directions = 2 if self.bidirectional else 1
    hx = torch.zeros(self.num_layers * num_directions,
                     max_batch_size, self.hidden_size,
                     dtype=input.dtype, device=input.device)
This is essentially the exact same init as the one you are trying to implement. Granted, you only want to reset the hidden states once per epoch (I don't see why...). Anyhow, a basic alternative would be to set hidden to None at the start of an epoch, pass it as-is to self.forward_back_prop, then to rnn, then to self.rnn, which will in turn default-initialize it for you. Then overwrite hidden with the hidden state returned by that RNN forward call.
To summarize, I've only kept the relevant parts of the code below. Remove the init_hidden function from AssetGRU and make these modifications:
def forward_back_prop(rnn, optimizer, criterion, inp, target, hidden):
    ...
    if hidden is not None:
        hidden = hidden.detach()
    ...
    output, hidden = rnn(inp, hidden)
    ...
    return loss.item(), hidden
def train_rnn(rnn, batch_size, optimizer, criterion, n_epochs, show_every_n_batches):
    ...
    for epoch_i in range(1, n_epochs + 1):
        hidden = None
        for batch_i, (inputs, labels) in enumerate(train_loader, 1):
            loss, hidden = forward_back_prop(rnn, optimizer, criterion,
                                             inputs, labels, hidden)
            ...
    ...
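Here's a self-contained sketch of that None-initialized loop; the model, data, and sizes are stand-ins for illustration, not your actual AssetGRU or loader:

import torch
import torch.nn as nn

# Stand-in model/data, just to show the hidden=None flow end to end.
gru = nn.GRU(input_size=1, hidden_size=8, num_layers=2, batch_first=True)
head = nn.Linear(8, 1)
optimizer = torch.optim.Adam(list(gru.parameters()) + list(head.parameters()))
criterion = nn.MSELoss()

for epoch in range(2):
    hidden = None                      # reset once per epoch
    for _ in range(5):                 # fake batches
        inputs = torch.randn(4, 10, 1)
        labels = torch.randn(4, 1)
        if hidden is not None:
            hidden = hidden.detach()   # cut the graph between batches
        output, hidden = gru(inputs, hidden)   # hidden=None -> zeros by default
        loss = criterion(head(output[:, -1]), labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

The first batch of each epoch then runs with the default zero state, exactly like the RNNBase code above.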