I want to know how I can pad a 2D numpy array with zeros using python 2.6.6 with numpy version 1.5.0. Sorry! But these are my limitations. Therefore I cannot use np.pad
. For example, I want to pad a
with zeros such that its shape matches b
. The reason why I want to do this is so I can do:
b-a
such that
>>> a
array([[ 1., 1., 1., 1., 1.],
[ 1., 1., 1., 1., 1.],
[ 1., 1., 1., 1., 1.]])
>>> b
array([[ 3., 3., 3., 3., 3., 3.],
[ 3., 3., 3., 3., 3., 3.],
[ 3., 3., 3., 3., 3., 3.],
[ 3., 3., 3., 3., 3., 3.]])
>>> c
array([[1, 1, 1, 1, 1, 0],
[1, 1, 1, 1, 1, 0],
[1, 1, 1, 1, 1, 0],
[0, 0, 0, 0, 0, 0]])
The only way I can think of doing this is appending, however this seems pretty ugly. is there a cleaner solution possibly using b.shape
?
Edit, Thank you to MSeiferts answer. I had to clean it up a bit, and this is what I got:
def pad(array, reference_shape, offsets):
"""
array: Array to be padded
reference_shape: tuple of size of ndarray to create
offsets: list of offsets (number of elements must be equal to the dimension of the array)
will throw a ValueError if offsets is too big and the reference_shape cannot handle the offsets
"""
# Create an array of zeros with the reference shape
result = np.zeros(reference_shape)
# Create a list of slices from offset to offset + shape in each dimension
insertHere = [slice(offsets[dim], offsets[dim] + array.shape[dim]) for dim in range(array.ndim)]
# Insert the array in the result at the specified offsets
result[insertHere] = array
return result
NumPy 1.7.0 (when numpy.pad
was added) is pretty old now (it was released in 2013) so even though the question asked for a way without using that function I thought it could be useful to know how that could be achieved using numpy.pad
.
It's actually pretty simple:
>>> import numpy as np
>>> a = np.array([[ 1., 1., 1., 1., 1.],
... [ 1., 1., 1., 1., 1.],
... [ 1., 1., 1., 1., 1.]])
>>> np.pad(a, [(0, 1), (0, 1)], mode='constant')
array([[ 1., 1., 1., 1., 1., 0.],
[ 1., 1., 1., 1., 1., 0.],
[ 1., 1., 1., 1., 1., 0.],
[ 0., 0., 0., 0., 0., 0.]])
In this case I used that 0
is the default value for mode='constant'
. But it could also be specified by passing it in explicitly:
>>> np.pad(a, [(0, 1), (0, 1)], mode='constant', constant_values=0)
array([[ 1., 1., 1., 1., 1., 0.],
[ 1., 1., 1., 1., 1., 0.],
[ 1., 1., 1., 1., 1., 0.],
[ 0., 0., 0., 0., 0., 0.]])
Just in case the second argument ([(0, 1), (0, 1)]
) seems confusing: Each list item (in this case tuple) corresponds to a dimension and item therein represents the padding before (first element) and after (second element). So:
[(0, 1), (0, 1)]
^^^^^^------ padding for second dimension
^^^^^^-------------- padding for first dimension
^------------------ no padding at the beginning of the first axis
^--------------- pad with one "value" at the end of the first axis.
In this case the padding for the first and second axis are identical, so one could also just pass in the 2-tuple:
>>> np.pad(a, (0, 1), mode='constant')
array([[ 1., 1., 1., 1., 1., 0.],
[ 1., 1., 1., 1., 1., 0.],
[ 1., 1., 1., 1., 1., 0.],
[ 0., 0., 0., 0., 0., 0.]])
In case the padding before and after is identical one could even omit the tuple (not applicable in this case though):
>>> np.pad(a, 1, mode='constant')
array([[ 0., 0., 0., 0., 0., 0., 0.],
[ 0., 1., 1., 1., 1., 1., 0.],
[ 0., 1., 1., 1., 1., 1., 0.],
[ 0., 1., 1., 1., 1., 1., 0.],
[ 0., 0., 0., 0., 0., 0., 0.]])
Or if the padding before and after is identical but different for the axis, you could also omit the second argument in the inner tuples:
>>> np.pad(a, [(1, ), (2, )], mode='constant')
array([[ 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 1., 1., 1., 1., 1., 0., 0.],
[ 0., 0., 1., 1., 1., 1., 1., 0., 0.],
[ 0., 0., 1., 1., 1., 1., 1., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
However I tend to prefer to always use the explicit one, because it's just to easy to make mistakes (when NumPys expectations differ from your intentions):
>>> np.pad(a, [1, 2], mode='constant')
array([[ 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 1., 1., 1., 1., 1., 0., 0.],
[ 0., 1., 1., 1., 1., 1., 0., 0.],
[ 0., 1., 1., 1., 1., 1., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0.]])
Here NumPy thinks you wanted to pad all axis with 1 element before and 2 elements after each axis! Even if you intended it to pad with 1 element in axis 1 and 2 elements for axis 2.
I used lists of tuples for the padding, note that this is just "my convention", you could also use lists of lists or tuples of tuples, or even tuples of arrays. NumPy just checks the length of the argument (or if it doesn't have a length) and the length of each item (or if it has a length)!