How can a list of vectors be elegantly normalized, in NumPy?
Here is an example that does not work:
from numpy import *
vectors = array([arange(10), arange(10)]) # All x's, then all y's
norms = apply_along_axis(linalg.norm, 0, vectors)
# Now, what I was expecting would work:
print(vectors.T / norms)  # vectors.T has shape (10, 2) and norms has shape (10,), but this does not work
The last operation yields "shape mismatch: objects cannot be broadcast to a single shape".
How can the 2D vectors in vectors be normalized elegantly with NumPy?
Edit: Why does the above not work, while adding a dimension to norms does work (as per my answer below)?
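For reference, here is a minimal sketch of the "add a dimension" fix alluded to above (assuming the same vectors array; this is an illustration, not the answer itself):
import numpy as np

vectors = np.array([np.arange(10), np.arange(10)])        # shape (2, 10)
norms = np.apply_along_axis(np.linalg.norm, 0, vectors)   # shape (10,)

# norms[:, np.newaxis] has shape (10, 1), which broadcasts against the
# (10, 2) array vectors.T; plain norms of shape (10,) does not.
normalized = vectors.T / norms[:, np.newaxis]
# Note: the first vector is (0, 0), so its normalized row comes out as nan (0/0).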
I came across this question and became curious about your method for normalizing. I use a different method to compute the magnitudes. Note that I also typically compute norms across the last axis (i.e., one norm per row here, rather than per column).
magnitudes = np.sqrt((vectors ** 2).sum(-1))[..., np.newaxis]
Typically, however, I just normalize like so:
vectors /= np.sqrt((vectors ** 2).sum(-1))[..., np.newaxis]
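A small usage sketch (with made-up values): note that the in-place /= only works if vectors already holds floats; the integer array from the question would have to be converted first.
import numpy as np

vectors = np.random.rand(5, 3)                               # 5 row vectors of length 3
vectors /= np.sqrt((vectors ** 2).sum(-1))[..., np.newaxis]  # normalize each row in place
print(np.sqrt((vectors ** 2).sum(-1)))                       # all approximately 1.0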
I ran a test to compare the times, and found that my method is faster by quite a bit, but Freddie Witherdon's suggestion is even faster.
import numpy as np
vectors = np.random.rand(100, 25)
# OP's
%timeit np.apply_along_axis(np.linalg.norm, 1, vectors)
# Output: 100 loops, best of 3: 2.39 ms per loop
# Mine
%timeit np.sqrt((vectors ** 2).sum(-1))[..., np.newaxis]
# Output: 10000 loops, best of 3: 13.8 us per loop
# Freddie's (from comment below)
%timeit np.sqrt(np.einsum('...i,...i', vectors, vectors))
# Output: 10000 loops, best of 3: 6.45 us per loop
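As a quick sanity check (my addition, not part of the original timings), the three approaches agree on the norms themselves:
a = np.apply_along_axis(np.linalg.norm, 1, vectors)
b = np.sqrt((vectors ** 2).sum(-1))
c = np.sqrt(np.einsum('...i,...i', vectors, vectors))
assert np.allclose(a, b) and np.allclose(b, c)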
Beware though: as this StackOverflow answer notes, there are some safety checks that do not happen with einsum, so you should be sure that the dtype of vectors is sufficient to store the square of the magnitudes accurately enough.
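For example (an illustration of mine, not from the linked answer), with a narrow integer dtype the intermediate squares can silently wrap around, so it is safest to cast to a wider float type first:
import numpy as np

small = (np.random.rand(4, 3) * 120).astype(np.int8)  # components fit in int8...
# ...but their squares (up to ~14000) do not, so dotting the raw int8 rows
# can wrap around silently.  Casting up front avoids that:
wide = small.astype(np.float64)
norms = np.sqrt(np.einsum('...i,...i', wide, wide))   # one norm per row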