Most efficient way to query for lat-long rectangle in SQL

Question

I'm currently making consistent queries for a block of land within a given latitude, longitude rectangle. The coordinates are stored as individual double precision values. I've created a single index of both columns, so the current query containing 15240 tiles takes .10 seconds on my local machine.

At the moment, there's 23 million rows in the table, but there's going to be around 800 million upon completion of the table, so I expect this query time to get much slower.

Here's the query I'm running, with example values:

SELECT * FROM territories
WHERE nwlat < 47.606977 and nwlat > 47.506977
and   nwlng < -122.232991 and nwlng > -122.338991;

Is there a more efficient way of doing this? I'm fairly new to large databases, so any help is appreciated. FYI, I'm using PostgreSQL.

Erwin Brandstetter · Accepted Answer

It would be much more efficient with a GiST or an SP-GiST index and a "box-contains-points" query ...

GiST index

The index is on a box with zero area, built from the same point (point(nwlat, nwlng)) twice.

There is a related code example in the manual for CREATE INDEX.

CREATE INDEX territories_box_gist_idx ON territories
USING gist (box(point(nwlat, nwlng), point(nwlat, nwlng)));

Query with the "overlaps" operator &&:

SELECT *
FROM   territories
WHERE  box(point(nwlat, nwlng), point(nwlat, nwlng))
    && '(47.606977, -122.232991), (47.506977, -122.338991)'::box;

SP-GiST index

Smaller index on just points:

CREATE INDEX territories_box_spgist_idx ON territories
USING spgist (point(nwlat, nwlng));

Query with the contains operator @>:

SELECT *
FROM   point
WHERE  '(47.606977, -122.232991), (47.506977, -122.338991)'::box
    @> point(nwlat, nwlng);

I get fastest results for the SP-GiST index in a simple test with 1M rows on Postgres 9.6.1.

For more sophisticated needs consider the PostGIS extension.

Most efficient way to query for lat-long rectangle in SQL

Answers (1)

GiST index

SP-GiST index

Related Questions