How to do product of matrices in PyTorch

blckbird picture blckbird · Jun 13, 2017 · Viewed 91.5k times · Source

In numpy I can do a simple matrix multiplication like this:

a = numpy.arange(2*3).reshape(3,2)
b = numpy.arange(2).reshape(2,1)
print(a)
print(b)
print(a.dot(b))

However, when I am trying this with PyTorch Tensors, this does not work:

a = torch.Tensor([[1, 2, 3], [1, 2, 3]]).view(-1, 2)
b = torch.Tensor([[2, 1]]).view(2, -1)
print(a)
print(a.size())

print(b)
print(b.size())

print(torch.dot(a, b))

This code throws the following error:

RuntimeError: inconsistent tensor size at /Users/soumith/code/builder/wheel/pytorch-src/torch/lib/TH/generic/THTensorMath.c:503

Any ideas how matrix multiplication can be conducted in PyTorch?

Answer

mbpaulus picture mbpaulus · Jun 13, 2017

You're looking for

torch.mm(a,b)

Note that torch.dot() behaves differently to np.dot(). There's been some discussion about what would be desirable here. Specifically, torch.dot() treats both a and b as 1D vectors (irrespective of their original shape) and computes their inner product. The error is thrown, because this behaviour makes your a a vector of length 6 and your b a vector of length 2; hence their inner product can't be computed. For matrix multiplication in PyTorch, use torch.mm(). Numpy's np.dot() in contrast is more flexible; it computes the inner product for 1D arrays and performs matrix multiplication for 2D arrays.

By popular demand, the function torch.matmul performs matrix multiplications if both arguments are 2D and computes their dot product if both arguments are 1D. For inputs of such dimensions, its behaviour is the same as np.dot. It also lets you do broadcasting or matrix x matrix, matrix x vector and vector x vector operations in batches. For more info, see its docs.

# 1D inputs, same as torch.dot
a = torch.rand(n)
b = torch.rand(n)
torch.matmul(a, b) # torch.Size([])

# 2D inputs, same as torch.mm
a = torch.rand(m, k)
b = torch.rand(k, j)
torch.matmul(a, b) # torch.Size([m, j])