Make the matrix multiplication operator @ work for scalars in numpy

jhin picture jhin · Dec 16, 2016 · Viewed 17.1k times · Source

In python 3.5, the @ operator was introduced for matrix multiplication, following PEP465. This is implemented e.g. in numpy as the matmul operator.

However, as proposed by the PEP, the numpy operator throws an exception when called with a scalar operand:

>>> import numpy as np
>>> np.array([[1,2],[3,4]]) @ np.array([[1,2],[3,4]])    # works
array([[ 7, 10],
       [15, 22]])
>>> 1 @ 2                                                # doesn't work
Traceback (most recent call last):
  File "<input>", line 1, in <module>
TypeError: unsupported operand type(s) for @: 'int' and 'int'

This is a real turnoff for me, since I'm implementing numerical signal processing algorithms that should work for both scalars and matrices. The equations for both cases are mathematically exactly equivalent, which is no surprise, since "1-D x 1-D matrix multiplication" is equivalent to scalar multiplication. The current state however forces me to write duplicate code in order to handle both cases correctly.

So, given that the current state is not satisfactory, is there any reasonable way I can make the @ operator work for scalars? I thought about adding a custom __matmul__(self, other) method to scalar data types, but this seems like a lot of hassle considering the number of involved internal data types. Could I change the implementation of the __matmul__ method for numpy array data types to not throw an exception for 1x1 array operands?

And, on a sidenote, which is the rationale behind this design decision? Off the top of my head, I cannot think of any compelling reasons not to implement that operator for scalars as well.

Answer

user6655984 picture user6655984 · Dec 29, 2016

As ajcr suggested, you can work around this issue by forcing some minimal dimensionality on objects being multiplied. There are two reasonable options: atleast_1d and atleast_2d which have different results in regard to the type being returned by @: a scalar versus a 1-by-1 2D array.

x = 3
y = 5
z = np.atleast_1d(x) @ np.atleast_1d(y)   # returns 15 
z = np.atleast_2d(x) @ np.atleast_2d(y)   # returns array([[15]])

However:

  • Using atleast_2d will lead to an error if x and y are 1D-arrays that would otherwise be multiplied normally
  • Using atleast_1d will result in the product that is either a scalar or a matrix, and you don't know which.
  • Both of these are more verbose than np.dot(x, y) which would handle all of those cases.

Also, the atleast_1d version suffers from the same flaw that would also be shared by having scalar @ scalar = scalar: you don't know what can be done with the output. Will z.T or z.shape throw an error? These work for 1-by-1 matrices but not for scalars. In the setting of Python, one simply cannot ignore the distinction between scalars and 1-by-1 arrays without also giving up all the methods and properties that the latter have.