I am using the following Nearest Neighbor Query in PostGIS :
SELECT g1.gid g2.gid FROM points as g1, polygons g2
WHERE g1.gid <> g2.gid
ORDER BY g1.gid, ST_Distance(g1.the_geom,g2.the_geom)
LIMIT k;
Now, that I have created indexes on the_geom as well as gid column on both the tables, this query is taking much more time than other spatial queries involving spatial joins b/w two tables.
Is there any better way to find K-nearest neighbors? I am using PostGIS.
And, another query which is taking a unusually long time despite creating indexes on geometry column is:
select g1.gid , g2.gid from polygons as g1 , polygons as g2
where st_area(g1.the_geom) > st_area(g2.the_geom) ;
I believe, these queries arent benefited by gist indexes, but why?
Whereas this query:
select a.polyid , sum(length(b.the_geom)) from polygon as a , roads as b
where st_intersects(a.the_geom , b.the_geom);
returns result after some time despite involving "roads" table which is much bigger than polygons or points table and also involve more complex spatial operators.
Since late September 2011, PostGIS has supported indexed nearest neighbor queries via a special operator(s) usable in the ORDER BY clause:
SELECT name, gid
FROM geonames
ORDER BY geom <-> st_setsrid(st_makepoint(-90,40),4326)
LIMIT 10;
...will return the 10 objects whose geom
is nearest -90,40
in a scalable way. A few more details (options and caveats) are in that announcement post and use of the <-> and the <#> operators is also now documented in the official PostGIS 2.0 reference. (The main difference between the two is that <->
compares the shape centroids and <#>
compares their boundaries — no difference for points, other shapes choose what is appropriate for your queries.)