Efficiently finding the closest coordinate pair from a set in Python

Kieran picture Kieran · Aug 23, 2016 · Viewed 18.6k times · Source

The Problem

Imagine I am stood in an airport. Given a geographic coordinate pair, how can one efficiently determine which airport I am stood in?

Inputs

  • A coordinate pair (x,y) representing the location I am stood at.
  • A set of coordinate pairs [(a1,b1), (a2,b2)...] where each coordinate pair represents one airport.

Desired Output

A coordinate pair (a,b) from the set of airport coordinate pairs representing the closest airport to the point (x,y).

Inefficient Solution

Here is my inefficient attempt at solving this problem. It is clearly linear in the length of the set of airports.

shortest_distance = None
shortest_distance_coordinates = None

point = (50.776435, -0.146834)

for airport in airports:
    distance = compute_distance(point, airport)
    if distance < shortest_distance or shortest_distance is None:
        shortest_distance = distance
        shortest_distance_coordinates = airport

The Question

How can this solution be improved? This might involve some way of pre-filtering the list of airports based on the coordinates of the location we are currently stood at, or sorting them in a certain order beforehand.

Answer

Juddling picture Juddling · Aug 23, 2016

Using a k-dimensional tree:

>>> from scipy import spatial
>>> airports = [(10,10),(20,20),(30,30),(40,40)]
>>> tree = spatial.KDTree(airports)
>>> tree.query([(21,21)])
(array([ 1.41421356]), array([1]))

Where 1.41421356 is the distance between the queried point and the nearest neighbour and 1 is the index of the neighbour.

See: http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.KDTree.query.html#scipy.spatial.KDTree.query