Most efficient way to find mode in numpy array

Nik picture Nik · May 2, 2013 · Viewed 161.3k times · Source

I have a 2D array containing integers (both positive or negative). Each row represents the values over time for a particular spatial site, whereas each column represents values for various spatial sites for a given time.

So if the array is like:

1 3 4 2 2 7
5 2 2 1 4 1
3 3 2 2 1 1

The result should be

1 3 2 2 2 1

Note that when there are multiple values for mode, any one (selected randomly) may be set as mode.

I can iterate over the columns finding mode one at a time but I was hoping numpy might have some in-built function to do that. Or if there is a trick to find that efficiently without looping.

Answer

fgb picture fgb · May 2, 2013

Check scipy.stats.mode() (inspired by @tom10's comment):

import numpy as np
from scipy import stats

a = np.array([[1, 3, 4, 2, 2, 7],
              [5, 2, 2, 1, 4, 1],
              [3, 3, 2, 2, 1, 1]])

m = stats.mode(a)
print(m)

Output:

ModeResult(mode=array([[1, 3, 2, 2, 1, 1]]), count=array([[1, 2, 2, 2, 1, 2]]))

As you can see, it returns both the mode as well as the counts. You can select the modes directly via m[0]:

print(m[0])

Output:

[[1 3 2 2 1 1]]