AttributeError: 'tuple' object has no attribute 'size'

Mattpats · Jan 2, 2021 · Viewed 7.8k times

UPDATE: after looking back on this question, most of the code was unnecessary. To make a long story short, the hidden state of a PyTorch RNN needs to be a torch tensor. When I posted the question, the hidden state was a tuple.
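For context, nn.LSTM is the one that works with a (hidden_state, cell_state) tuple; nn.RNN and nn.GRU expect a single tensor. A minimal illustration of the difference (module sizes here are arbitrary):

import torch
import torch.nn as nn

x = torch.randn(8, 5, 10)            # (batch, seq_len, input_size)

lstm = nn.LSTM(10, 20, batch_first=True)
h0 = torch.zeros(1, 8, 20)           # (num_layers, batch, hidden_size)
c0 = torch.zeros(1, 8, 20)
out, (hn, cn) = lstm(x, (h0, c0))    # LSTM: hidden state is a tuple

gru = nn.GRU(10, 20, batch_first=True)
out, hn = gru(x, h0)                 # GRU: hidden state is a single tensor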

Below is my data loader.

import numpy as np
import torch
from torch.utils.data import TensorDataset, DataLoader

def batch_data(log_returns, sequence_length, batch_size):
    """
    Batch the neural network data using DataLoader
    :param log_returns: asset's daily log returns
    :param sequence_length: The sequence length of each batch
    :param batch_size: The size of each batch; the number of sequences in a batch
    :return: DataLoader with batched data
    """
    
    # total number of batches we can make
    n_batches = len(log_returns)//batch_size
    
    # Keep only enough data points to make full batches
    log_returns = log_returns[:n_batches * batch_size]
    
    y_len = len(log_returns) - sequence_length
    
    x, y = [], []
    for idx in range(0, y_len):
        idx_end = sequence_length + idx
        x_batch = log_returns[idx:idx_end]
        x.append(x_batch)
        # the target is the next value after the input sequence
        batch_y = log_returns[idx_end]    
        y.append(batch_y)    
    
    # create tensor datasets
    x_tensor = torch.from_numpy(np.asarray(x))
    y_tensor = torch.from_numpy(np.asarray(y))
    
    # make x_tensor 3-d instead of 2-d
    x_tensor = x_tensor.unsqueeze(-1)
    
    data = TensorDataset(x_tensor, y_tensor)
    
    data_loader = DataLoader(data, shuffle=False, batch_size=batch_size)
    
    # return a dataloader
    return data_loader
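For reference, a quick sanity check of this data loader with made-up numbers (100 fake returns, sequences of 10, batches of 20). Each full batch of inputs should come out as (20, 10, 1) and each batch of targets as (20,):

import numpy as np
import torch

log_returns = np.random.randn(100).astype(np.float32)  # hypothetical daily log returns
loader = batch_data(log_returns, sequence_length=10, batch_size=20)

inputs, targets = next(iter(loader))
print(inputs.shape, targets.shape)   # torch.Size([20, 10, 1]) torch.Size([20])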
And here is the init_hidden method from my model class:

    def init_hidden(self, batch_size):
        ''' Initializes hidden state '''
        # Create two new tensors with sizes n_layers x batch_size x n_hidden,
        # initialized to zero, for hidden state and cell state of LSTM
        weight = next(self.parameters()).data
        
        if (train_on_gpu):
            hidden = (weight.new(self.n_layers, batch_size, self.n_hidden).zero_().cuda(),
                      weight.new(self.n_layers, batch_size, self.n_hidden).zero_().cuda())
        else:
            hidden = (weight.new(self.n_layers, batch_size, self.n_hidden).zero_(),
                      weight.new(self.n_layers, batch_size, self.n_hidden).zero_())
        
        return hidden

I don't know what is wrong. When I try to start training the model, I am getting the error message:

AttributeError: 'tuple' object has no attribute 'size'

Answer

Ivan · Jan 3, 2021

The issue comes from the fact that hidden (in the forward definition) isn't a torch.Tensor. Therefore, r_output, hidden = self.gru(nn_input, hidden) raises a rather confusing error without specifying exactly what's wrong in the arguments. Although you can see it's raised inside an nn.RNN method named check_hidden_size()...
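A minimal reproduction of that confusing traceback (shapes are arbitrary; this is the behavior I see on PyTorch 1.x):

import torch
import torch.nn as nn

gru = nn.GRU(input_size=1, hidden_size=4, num_layers=2, batch_first=True)
x = torch.randn(3, 5, 1)                      # (batch, seq_len, input_size)
bad_hidden = (torch.zeros(2, 3, 4),           # LSTM-style (h, c) tuple...
              torch.zeros(2, 3, 4))           # ...which nn.GRU does not accept

out, hn = gru(x, bad_hidden)                  # AttributeError: 'tuple' object has no attribute 'size'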

I was confused at first, thinking that the second argument of nn.RNN, h0, was a tuple containing (hidden_state, cell_state). The same can be said of the second element returned by that call, hn. That's not the case: h0 and hn are both torch.Tensors. Interestingly enough though, you are able to unpack stacked tensors:

>>> z = torch.stack([torch.Tensor([1,2,3]), torch.Tensor([4,5,6])])
>>> a, b = z
>>> a, b
(tensor([1., 2., 3.]), tensor([4., 5., 6.]))

You are supposed to provide a tensor as the second argument of an nn.GRU __call__.
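For example (layer count and sizes are placeholders):

import torch
import torch.nn as nn

gru = nn.GRU(input_size=1, hidden_size=4, num_layers=2, batch_first=True)
x = torch.randn(3, 5, 1)

h0 = torch.zeros(2, 3, 4)   # (num_layers, batch, hidden_size): a plain tensor
out, hn = gru(x, h0)        # works; hn is also a tensor of shape (2, 3, 4)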


Edit - After further inspection of your code, I found out that you are converting hidden back to a tuple... In cell [14] you have hidden = tuple([each.data for each in hidden]), which basically overwrites the modification you made in init_hidden with torch.stack.

Take a step back and look at the source code for RNNBase, the base class for RNN modules. If the hidden state is not given to forward, it will default to:

if hx is None:
    num_directions = 2 if self.bidirectional else 1
    hx = torch.zeros(self.num_layers * num_directions,
                     max_batch_size, self.hidden_size,
                     dtype=input.dtype, device=input.device)

This is essentially the same initialization as the one you are trying to implement. Granted, you only want to reset the hidden state once per epoch (I don't see why...). Anyhow, a basic alternative would be to set hidden to None at the start of an epoch, pass it as-is to self.forward_back_prop, then to rnn, then to self.gru, which will in turn default-initialize it for you. Then overwrite hidden with the hidden state returned by that RNN forward call.
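You can check that equivalence yourself (shapes are arbitrary): omitting the hidden state, or passing None, gives the same result as passing explicit zeros.

import torch
import torch.nn as nn

gru = nn.GRU(input_size=1, hidden_size=4, num_layers=2, batch_first=True)
x = torch.randn(3, 5, 1)

out_default, hn_default = gru(x)                    # hx defaults to zeros internally
out_zeros, hn_zeros = gru(x, torch.zeros(2, 3, 4))  # explicit zero init
print(torch.allclose(out_default, out_zeros))       # True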

To summarize, I've only kept the relevant parts of the code below. Remove the init_hidden function from AssetGRU and make these modifications. Note that hidden is detached before each forward pass so that gradients don't backpropagate across batch boundaries:

def forward_back_prop(rnn, optimizer, criterion, inp, target, hidden):
    ...
    if hidden is not None:
        hidden = hidden.detach()
    ...
    output, hidden = rnn(inp, hidden)  
    ...
    return loss.item(), hidden


def train_rnn(rnn, batch_size, optimizer, criterion, n_epochs, show_every_n_batches):
    ...
    for epoch_i in range(1, n_epochs + 1):
        
        hidden = None
        
        for batch_i, (inputs, labels) in enumerate(train_loader, 1):
            loss, hidden = forward_back_prop(rnn, optimizer, criterion, 
                                             inputs, labels, hidden)
            ...

    ...
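To make the pattern concrete, here is a minimal self-contained sketch; the AssetGRU definition, dimensions, and data below are stand-ins I've made up, not your actual code:

import torch
import torch.nn as nn

class AssetGRU(nn.Module):
    def __init__(self, hidden_size=16, n_layers=2):
        super().__init__()
        self.gru = nn.GRU(1, hidden_size, n_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, 1)

    def forward(self, nn_input, hidden=None):
        r_output, hidden = self.gru(nn_input, hidden)  # hidden=None -> zero init inside nn.GRU
        return self.fc(r_output[:, -1]), hidden        # predict from the last time step

rnn = AssetGRU()
optimizer = torch.optim.Adam(rnn.parameters())
criterion = nn.MSELoss()
# five fake batches of shape (batch, seq_len, input_size)
train_loader = [(torch.randn(20, 10, 1), torch.randn(20, 1)) for _ in range(5)]

for epoch_i in range(1, 3):
    hidden = None                        # reset once per epoch
    for batch_i, (inputs, labels) in enumerate(train_loader, 1):
        if hidden is not None:
            hidden = hidden.detach()     # cut the graph between batches
        optimizer.zero_grad()
        output, hidden = rnn(inputs, hidden)
        loss = criterion(output, labels)
        loss.backward()
        optimizer.step()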