Identifying points with the smallest Euclidean distance

python algorithm numpy nearest-neighbor euclidean-distance

Ηλίας · Feb 25, 2011 · Viewed 8.5k times · Source

I have a collection of n-dimensional points and I want to find which two are closest to each other. The best I could come up with, for 2 dimensions, is:

import numpy as np

myArr = np.array([[1, 2],
                  [3, 4],
                  [5, 6],
                  [7, 8]])

n = myArr.shape[0]
# Squared distance for every ordered pair (i, j) with i != j
cross = [[int(np.sum((myArr[i] - myArr[j]) ** 2)), i, j]
         for i in range(n)
         for j in range(n)
         if i != j
         ]

print(min(cross))

which gives

[8, 0, 1]

But this is too slow for large arrays. What kind of optimisation can I apply to it?

Answer

Try scipy.spatial.distance.pdist(myArr). This gives you a condensed distance matrix: a flat array holding one distance for each unordered pair (i, j) with i < j. Call argmin on it to get the index of the smallest value, then convert that index back into the pair (i, j).
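
A minimal sketch of that approach, assuming SciPy is installed. The helper name closest_pair is made up for illustration, and mapping the condensed index back to a pair with np.triu_indices is just one possible way to do it (it relies on pdist ordering pairs as (0,1), (0,2), ..., (1,2), ..., which matches the row-major upper triangle):

```python
import numpy as np
from scipy.spatial.distance import pdist


def closest_pair(points):
    """Return (distance, i, j) for the closest pair of rows in `points`.

    Hypothetical helper for illustration; not part of NumPy or SciPy.
    """
    points = np.asarray(points, dtype=float)
    n = points.shape[0]

    # Condensed distance matrix: one entry per unordered pair (i, j), i < j
    condensed = pdist(points)

    # Position of the smallest distance within the condensed array
    k = int(np.argmin(condensed))

    # Upper-triangle indices (excluding the diagonal) follow the same order
    # as the condensed array, so entry k corresponds to (rows[k], cols[k])
    rows, cols = np.triu_indices(n, k=1)
    return float(condensed[k]), int(rows[k]), int(cols[k])


my_arr = np.array([[1, 2],
                   [3, 4],
                   [5, 6],
                   [7, 8]])

dist, i, j = closest_pair(my_arr)
print(dist, i, j)
```

For the array in the question this reports the pair (0, 1). Note that pdist returns the actual Euclidean distance (about 2.83 here), not the squared distance 8 printed by the original loop. Both the condensed matrix and the index arrays still need O(n^2) memory, so this speeds up the loop rather than changing the complexity.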
