PyTorch : predict single example

blue-sky picture blue-sky · Jun 26, 2018 · Viewed 33.3k times · Source

Following the example from:

https://github.com/jcjohnson/pytorch-examples

This code trains successfully:

# Code in file tensor/two_layer_net_tensor.py
import torch

device = torch.device('cpu')
# device = torch.device('cuda') # Uncomment this to run on GPU

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random input and output data
x = torch.randn(N, D_in, device=device)
y = torch.randn(N, D_out, device=device)

# Randomly initialize weights
w1 = torch.randn(D_in, H, device=device)
w2 = torch.randn(H, D_out, device=device)

learning_rate = 1e-6
for t in range(500):
  # Forward pass: compute predicted y
  h = x.mm(w1)
  h_relu = h.clamp(min=0)
  y_pred = h_relu.mm(w2)

  # Compute and print loss; loss is a scalar, and is stored in a PyTorch Tensor
  # of shape (); we can get its value as a Python number with loss.item().
  loss = (y_pred - y).pow(2).sum()
  print(t, loss.item())

  # Backprop to compute gradients of w1 and w2 with respect to loss
  grad_y_pred = 2.0 * (y_pred - y)
  grad_w2 = h_relu.t().mm(grad_y_pred)
  grad_h_relu = grad_y_pred.mm(w2.t())
  grad_h = grad_h_relu.clone()
  grad_h[h < 0] = 0
  grad_w1 = x.t().mm(grad_h)

  # Update weights using gradient descent
  w1 -= learning_rate * grad_w1
  w2 -= learning_rate * grad_w2

How can I predict a single example ? My experience thus far is utilising feedforward networks using just numpy. After training a model I utilise forward propagation but for a single example :

numpy code snippet where new is the output value I'm attempting to predict:

new = np.asarray(toclassify) 
Z1 = np.dot(weight_layer_1, new.T) + bias_1 
sigmoid_activation_1 = sigmoid(Z1) 
Z2 = np.dot(weight_layer_2, sigmoid_activation_1) + bias_2 
sigmoid_activation_2 = sigmoid(Z2)

sigmoid_activation_2 contains the predicted vector attributes

Is the idiomatic PyTorch way same? Use forward propagation in order to make a single prediction?

Answer

AveryLiu picture AveryLiu · Jun 27, 2018

The code you posted is a simple demo trying to reveal the inner mechanism of such deep learning frameworks. These frameworks, including PyTorch, Keras, Tensorflow and many more automatically handle the forward calculation, the tracking and applying gradients for you as long as you defined the network structure. However, the code you showed still try to do these stuff manually. That's the reason why you feel cumbersome when predicting one example, because you are still doing it from scratch.

In practice, we will define a model class inherited from torch.nn.Module and initialize all the network components (like neural layer, GRU, LSTM layer etc.) in the __init__ function, and define how these components interact with the network input in the forward function.

Taken the example from the page you've provided:

# Code in file nn/two_layer_net_module.py
import torch

class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        """
        In the constructor we instantiate two nn.Linear modules and 
        assign them as
        member variables.
        """
        super(TwoLayerNet, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        """
        In the forward function we accept a Tensor of input data and we must return
        a Tensor of output data. We can use Modules defined in the constructor as
        well as arbitrary (differentiable) operations on Tensors.
        """
        h_relu = self.linear1(x).clamp(min=0)
        y_pred = self.linear2(h_relu)
        return y_pred

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Construct our model by instantiating the class defined above.
model = TwoLayerNet(D_in, H, D_out)

# Construct our loss function and an Optimizer. The call to 
model.parameters()
# in the SGD constructor will contain the learnable parameters of the two
# nn.Linear modules which are members of the model.
loss_fn = torch.nn.MSELoss(size_average=False)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)
for t in range(500):
    # Forward pass: Compute predicted y by passing x to the model
    y_pred = model(x)

    # Compute and print loss
    loss = loss_fn(y_pred, y)
    print(t, loss.item())

    # Zero gradients, perform a backward pass, and update the weights.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

The code defined a model named TwoLayerNet, it initializes two linear layers in the __init__ function and further defines how these two linears interact with the input x in the forward function. Having the model defined, we can perform a single feed-forward operation simply by calling the model instance as illustrated by the end of code snippet:

y_pred = model(x)