Is it possible to map a NumPy array in place? If yes, how?
Given a_values, a 2D array, this is the bit of code that does the trick for me at the moment:
for row in range(len(a_values)):
    for col in range(len(a_values[0])):
        a_values[row][col] = dim(a_values[row][col])
But it's so ugly that I suspect there must be a function somewhere within NumPy that does the same thing, looking something like:
a_values.map_in_place(dim)
but if something like the above exists, I've been unable to find it.
It's only worth trying to do this in-place if you are under significant space constraints. If that's the case, it is possible to speed up your code a little bit by iterating over a flattened view of the array. Since reshape returns a new view when possible, the data itself isn't copied (unless the original has unusual structure).
I don't know of a better way to achieve bona fide in-place application of an arbitrary Python function.
>>> def flat_for(a, f):
...     a = a.reshape(-1)
...     for i, v in enumerate(a):
...         a[i] = f(v)
...
>>> import numpy
>>> a = numpy.arange(25).reshape(5, 5)
>>> flat_for(a, lambda x: x + 5)
>>> a
array([[ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24],
       [25, 26, 27, 28, 29]])
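The whole trick hinges on reshape handing back a view rather than a copy. As a quick sanity check (a minimal sketch, assuming your NumPy is recent enough to have numpy.shares_memory), you can verify whether the flattened array shares the original's buffer; with a non-contiguous array, reshape silently copies and flat_for would end up modifying a throwaway array:

>>> a = numpy.arange(25).reshape(5, 5)
>>> numpy.shares_memory(a, a.reshape(-1))   # flat view: writes land in a
True
>>> b = a[:, ::2]                           # non-contiguous slice
>>> numpy.shares_memory(b, b.reshape(-1))   # reshape had to copy
False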
Some timings:
>>> a = numpy.arange(2500).reshape(50, 50)
>>> f = lambda x: x + 5
>>> %timeit flat_for(a, f)
1000 loops, best of 3: 1.86 ms per loop
It's about twice as fast as the nested loop version:
>>> a = numpy.arange(2500).reshape(50, 50)
>>> def nested_for(a, f):
...     for i in range(len(a)):
...         for j in range(len(a[0])):
...             a[i][j] = f(a[i][j])
...
>>> %timeit nested_for(a, f)
100 loops, best of 3: 3.79 ms per loop
Of course vectorize is still faster, so if you can make a copy, use that:
>>> a = numpy.arange(2500).reshape(50, 50)
>>> g = numpy.vectorize(lambda x: x + 5)
>>> %timeit g(a)
1000 loops, best of 3: 584 us per loop
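If the reason you want in-place behaviour is that other references or views of a must see the update (rather than memory pressure), note that you can assign the vectorized result back into the existing buffer, at the cost of one temporary copy. A minimal sketch, reusing the g defined above:

>>> a = numpy.arange(25).reshape(5, 5)
>>> b = a                  # another reference to the same array
>>> a[...] = g(a)          # write the result back into a's buffer
>>> b[0]                   # the other reference sees the change
array([5, 6, 7, 8, 9])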
And if you can rewrite dim using built-in ufuncs, then please, please, don't vectorize:
>>> a = numpy.arange(2500).reshape(50, 50)
>>> %timeit a + 5
100000 loops, best of 3: 4.66 us per loop
numpy does operations like += in place, just as you might expect -- so you can get the speed of a ufunc with in-place application at no cost. Sometimes it's even faster! See here for an example.
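A minimal sketch of that in-place ufunc route (numpy.add with out= writes straight back into the existing buffer, just like +=):

>>> a = numpy.arange(25).reshape(5, 5)
>>> a += 5                        # in place, no temporary array
>>> numpy.add(a, 5, out=a)        # explicit ufunc form, also in place
array([[10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24],
       [25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34]])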
By the way, my original answer to this question, which can be viewed in its edit history, is ridiculous, and involved vectorizing over indices into a. Not only did it have to do some funky stuff to bypass vectorize's type-detection mechanism, it turned out to be just as slow as the nested loop version. So much for cleverness!