I have a collection of n-dimensional points and I want to find which two are closest. The best I could come up with for 2 dimensions is:
from numpy import *
myArr = array( [[1, 2],
                [3, 4],
                [5, 6],
                [7, 8]] )
n = myArr.shape[0]
cross = [[sum( ( myArr[i] - myArr[j] ) ** 2 ), i, j]
         for i in xrange( n )
         for j in xrange( n )
         if i != j
         ]
print min( cross )
which gives
[8, 0, 1]
But this is too slow for large arrays. What kind of optimisation can I apply to it?
RELATED:
Euclidean distance between points in two different Numpy arrays, not within
Try scipy.spatial.distance.pdist(myArr). This will give you a condensed distance matrix. You can use argmin on it to find the index of the smallest value, and that index can then be converted back into the pair of point indices.
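A minimal sketch of that approach, in case it helps. The helper name closest_pair is made up for illustration, and using np.triu_indices to map the condensed index back to a pair is just one way to do the conversion (it relies on pdist listing pairs in row-major upper-triangle order). Note that pdist returns Euclidean distances by default, not the squared distances from the question.

```python
import numpy as np
from scipy.spatial.distance import pdist


def closest_pair(points):
    """Return (distance, i, j) for the two closest rows of `points`.

    Hypothetical helper for illustration; not part of SciPy.
    """
    points = np.asarray(points, dtype=float)
    n = points.shape[0]
    if n < 2:
        raise ValueError("need at least two points")

    # Condensed distance matrix: one entry per pair (i, j) with i < j,
    # ordered (0,1), (0,2), ..., (0,n-1), (1,2), ...
    condensed = pdist(points)

    # Position of the smallest distance in the condensed vector.
    k = int(np.argmin(condensed))

    # Upper-triangle indices (excluding the diagonal) come out in the
    # same order as the condensed vector, so entry k is the pair we want.
    rows, cols = np.triu_indices(n, k=1)
    return float(condensed[k]), int(rows[k]), int(cols[k])


my_arr = np.array([[1, 2],
                   [3, 4],
                   [5, 6],
                   [7, 8]])
dist, i, j = closest_pair(my_arr)
print(dist, i, j)
```

This still computes all n*(n-1)/2 distances, so memory grows quadratically with the number of points; the gain over the list comprehension is that the loop runs in compiled code and each pair is only visited once.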