If you are given say a list of 10 vectors, called A that represent different groups. Then you have a time series of vectors v1,v2,...,vn, each being a vector as well. I was wondering if there was a way to find the "closest" vector in A for each v1,v2,...,vn if you define some distance metric?
Is there a quick way to do this besides for looping through and just comparing all entries?
Edit: No I am not asking how to do k-means or something like that.
You can use the spatial KDtree in scipy. It uses a fast tree algorithm to identify close by points for vectors of arbitrary dimension.
Edit: sorry, if you are looking for arbitrary distance metrics, a Tree like structure might still be an option.
Here is an example:
>>> from scipy import spatial
>>> A = [[0,1,2,3,4], [4,3,2,1,0], [2,5,3,7,1], [1,0,1,0,1]]
>>> tree = spatial.KDTree(A)
This sets up the KDTree with all the points in A, allowing you to perform fast spatial searches within it. Such a query takes a vector and returns the closest neighbor in A for it:
>>> tree.query([0.5,0.5,0.5,0.5,0.5])
(1.1180339887498949, 3)
The first return value is the distance of the closest neighbor and the second its position in A, such that you can obtain it for example like this:
>>> A[ tree.query([0.5,0.5,0.5,0.5,0.5])[1] ]
[1, 0, 1, 0, 1]