sliding window of M-by-N shape numpy.ndarray

siamii picture siamii · Mar 30, 2013 · Viewed 53k times · Source

I have a numpy array of shape (6,2)

[[00,01],
 [10,11],
 [20,21],
 [30,31],
 [40,41],
 [50,51]]

I need a sliding window with step size 1 and window size 3 likes this:

[[00,01,10,11,20,21],
 [10,11,20,21,30,31],
 [20,21,30,31,40,41],
 [30,31,40,41,50,51]]

I'm looking for a numpy solution. If your solution could parametrize the the shape of the original array as well as the window size and step size, that'd great.

I found this related answer Using strides for an efficient moving average filter but I don't see how to specify the stepsize there and how to collapse the window from the 3d to a continuous 2d array. Also this Rolling or sliding window iterator in Python but that's in Python and I'm not sure how efficient that is. Also, it supports elements but does not join them together in the end if each element has multiple features.

Answer

user42541 picture user42541 · Feb 15, 2017

You can do a vectorized sliding window in numpy using fancy indexing.

>>> import numpy as np

>>> a = np.array([[00,01], [10,11], [20,21], [30,31], [40,41], [50,51]])

>>> a
array([[ 0,  1],
       [10, 11],
       [20, 21],                      #define our 2d numpy array
       [30, 31],
       [40, 41],
       [50, 51]])

>>> a = a.flatten()

>>> a
array([ 0,  1, 10, 11, 20, 21, 30, 31, 40, 41, 50, 51])    #flattened numpy array

>>> indexer = np.arange(6)[None, :] + 2*np.arange(4)[:, None]

>>> indexer
array([[ 0,  1,  2,  3,  4,  5],
       [ 2,  3,  4,  5,  6,  7],            #sliding window indices
       [ 4,  5,  6,  7,  8,  9],
       [ 6,  7,  8,  9, 10, 11]])

>>> a[indexer]
array([[ 0,  1, 10, 11, 20, 21],
       [10, 11, 20, 21, 30, 31],            #values of a over sliding window
       [20, 21, 30, 31, 40, 41],
       [30, 31, 40, 41, 50, 51]])

>>> np.sum(a[indexer], axis=1)
array([ 63, 123, 183, 243])         #sum of values in 'a' under the sliding window.

Explanation for what this code is doing.

The np.arange(6)[None, :] creates a row vector 0 through 6, and np.arange(4)[:, None] creates a column vector 0 through 4. This results in a 4x6 matrix where each row (six of them) represents a window, and the number of rows (four of them) represents the number of windows. The multiple of 2 makes the sliding window slide 2 units at a time which is necessary for sliding over each tuple. Using numpy array slicing you can pass the sliding window into the flattened numpy array and do aggregates on them like sum.