user30455
user30455

Reputation: 19

How to count number of points in certain distance from center point using postgis?

I'm looking for a solution of following problem: I have table of thousands of point features in postgress 9.3. I would like to get rid of insignificant points. By insignificant I understand points with not many neighbors, what could be defined either by distance or by extended bbox. Preferably I'd like to count this neighbors and then decide what number of them is significant. Any advice welcome!

Upvotes: 1

Views: 1984

Answers (2)

John Powell
John Powell

Reputation: 12571

The easiest way to do this is using a spatial self-join to count the number of points from one table that intersect the buffer of each point in another table (where both tables are the same, ie, you are looking at a Cartesian product, although this is mitigated by having a spatial index -- see below). Assuming you have a table called points and your point field is called geom and each point had an ID field, and you are interested in all points within, say, 1km of each point, you would do:

select a.id, count(b.id) as num_neighbors 
from 
    points a, points b 
where st_dwithin(a.geom, b.geom, 1000) and a.id != b.id 
group by a.id 
order by num_neighbors;

Use order by and limit x to find the x points with the lowest number of neighbors.

If you wanted to use a rectangular region instead of a circular region for calculating the number of neigbors, use the ST_Expand function (in conjunction with ST_Intersects) instead of ST_DWithin. However, this will be slower, as well as less accurate, as ST_DWithin uses the index directly.

For performance reasons on non-trivial table sizes make sure you have a spatial index on your geometry field.

EDIT 2: Following your question update, I suggest you try this aproach, following the update, set, from, where order that Postgres uses for updates from a subquery.

update point SET num_neighbors=agg.num_neighbors 
   from (
      select count(b.id) as num_neighbors, a.id 
      from 
          point a, point b 
       where st_dwithin(a.geom, b.geom, 1000) and a.id != b.id group by a.id) 
   agg 
where agg.id=point.id;

Use explain in front of this query and compare the expected execution time with the query you have. I don't have your exact data, but with a test set of random points, this query was orders of magnitude faster than the one you put in your updated question.

EDIT 1: Original answer edited to take account of much better approach using ST_DWithin as suggested by Mike T.

Upvotes: 3

user30455
user30455

Reputation: 19

@John as I need to store results in my source table I slightly modified your statement (as I'm just a beginner in SQL it can be a bit childish):

UPDATE point SET num_neighbors =(
select count(b.id) as num_neighbors from 
    point a, point b 
where st_dwithin(a.geom, b.geom, 100) and a.id != b.id and a.id=point.id
group by a.id);

It takes around 200s for ~85k records which is acceptable but not ecstatic (without gist index it takes longer than for dinosaurs to become a petrol). If you see any method to speed it up (even compromising some accuracy) that would be awesome!.

regards Thomas

Upvotes: 0

Related Questions