Say I have a list [1,2,3,4,5,6,7]. I want to find the 3 closest numbers to, say, 6.5. Then the returned value would be [5,6,7].
Finding the single closest number is not that tricky in Python; it can be done using
min(myList, key=lambda x:abs(x-myNumber))
But I would rather not put a loop around this to find the k closest numbers. Is there a Pythonic way to achieve this?
The heapq.nsmallest() function will do this neatly and efficiently:
>>> from heapq import nsmallest
>>> s = [1,2,3,4,5,6,7]
>>> nsmallest(3, s, key=lambda x: abs(x - 6.5))
[6, 7, 5]
Essentially this says, "Give me the three input values that have the smallest absolute difference from the number 6.5".
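As a small sketch of how this could be wrapped up for reuse (the helper name k_closest and its argument order are my own, not part of heapq): the heapq documentation describes nsmallest(n, iterable, key=f) as equivalent to sorted(iterable, key=f)[:n], so the result comes back ordered by distance rather than by value. If the ascending order from the question is wanted, sorting the result afterwards should be enough.

```python
from heapq import nsmallest

def k_closest(values, target, k):
    """Return the k members of values nearest to target (hypothetical helper)."""
    return nsmallest(k, values, key=lambda x: abs(x - target))

data = [1, 2, 3, 4, 5, 6, 7]

# Ordered by distance from 6.5; 6 and 7 tie, so input order decides.
by_distance = k_closest(data, 6.5, 3)

# Same selection via a full sort, as described in the heapq documentation.
via_sort = sorted(data, key=lambda x: abs(x - 6.5))[:3]

# Ascending order, matching the output asked for in the question.
ascending = sorted(by_distance)
```

Here by_distance and via_sort are both [6, 7, 5], and ascending is [5, 6, 7].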
In the comments, @Phylliida asked how to optimize for repeated lookups with differing start points. One approach would be to pre-sort the data and then use bisect to locate the center of a small search segment:
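from bisect import bisect
from heapq import nsmallest

def k_nearest(k, center, sorted_data):
    'Return *k* members of *sorted_data* nearest to *center*'
    # Index where *center* would be inserted to keep the data sorted
    i = bisect(sorted_data, center)
    # The k nearest values must lie within k positions on either side of i
    segment = sorted_data[max(i-k, 0) : i+k]
    return nsmallest(k, segment, key=lambda x: abs(x - center))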
For example:
>>> s.sort()
>>> k_nearest(3, 6.5, s)
[6, 7, 5]
>>> k_nearest(3, 0.5, s)
[1, 2, 3]
>>> k_nearest(3, 4.5, s)
[4, 5, 3]
>>> k_nearest(3, 5.0, s)
[5, 4, 6]
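One way to gain some confidence in the windowing idea is to compare it against a plain scan of the whole list. The sketch below repeats k_nearest so that it runs on its own; brute_force and the list of test centers are names and values I made up for the check, and this is a spot check on one small list, not a proof.

```python
from bisect import bisect
from heapq import nsmallest

def k_nearest(k, center, sorted_data):
    'Return *k* members of *sorted_data* nearest to *center*'
    i = bisect(sorted_data, center)
    segment = sorted_data[max(i - k, 0) : i + k]
    return nsmallest(k, segment, key=lambda x: abs(x - center))

def brute_force(k, center, data):
    # Hypothetical reference helper: examine every element.
    return nsmallest(k, data, key=lambda x: abs(x - center))

s = sorted([1, 2, 3, 4, 5, 6, 7])
centers = [0.5, 4.5, 5.0, 6.5, 100]

# Collect any centers where the two approaches disagree.
mismatches = [c for c in centers if k_nearest(3, c, s) != brute_force(3, c, s)]
print(mismatches)  # an empty list means both approaches agree on these inputs
```

The windowed version only passes at most 2*k elements to nsmallest after a binary search, which is where the saving for repeated lookups on large sorted data would come from.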