Reputation: 361
I have a database where I want to do a very quick and rough geohash matching. I want to find the first, most similar match by reducing the resolution of the geohash until I get a hit. The following query gets me all the geohashes I'm interested in, but I need to select the one from the below set that is the closest match to my original geohash.
SELECT business_address, geohash FROM geolocation
WHERE
geohash LIKE (CONCAT(SUBSTRING(:'geohash', 1, 12), '%')) OR
geohash LIKE (CONCAT(SUBSTRING(:'geohash', 1, 11), '%')) OR
geohash LIKE (CONCAT(SUBSTRING(:'geohash', 1, 10), '%')) OR
geohash LIKE (CONCAT(SUBSTRING(:'geohash', 1, 9), '%')) OR
geohash LIKE (CONCAT(SUBSTRING(:'geohash', 1, 8), '%')) OR
geohash LIKE (CONCAT(SUBSTRING(:'geohash', 1, 7), '%'));
The problem is that there is no clear way for me to order this result, or add a column that I can use to order them. Simply limiting the results to 1 and ordering by geohash isn't sufficient, as lower resolution matches might be on either side of the full resolution geohash. Is there a way for me to do this without doing 6 separate queries?
Upvotes: 2
Views: 87
Reputation: 1269443
You can use generate_series()
:
SELECT DISTINCT ON (gl.business_address) gl.business_address, gl.geohash
FROM geolocation gl JOIN
generate_series(12, 7, -1) gs(val)
ON gl.geohash LIKE SUBSTRING(:'geohash', 1, gs.val) || '%')
ORDER BY gl.business_address, gl.geohash ASC;
This uses DISTINCT ON
because that is usually faster than GROUP BY
and allows you to include additional columns.
If the values you want to compare are not really sequential numbers, use VALUES()
:
SELECT DISTINCT ON (gl.business_address) gl.business_address, gl.geohash
FROM geolocation gl JOIN
(VALUES (12), (11), (10), (9), (8), (7)) v(val)
ON gl.geohash LIKE SUBSTRING(:'geohash', 1, v.val) || '%')
ORDER BY gl.business_address, gl.geohash ASC;
Upvotes: 0
Reputation: 3833
You may do this by giving matching slno to all the matches, then find the minimum of slno to get the closest of match
WITH CTE AS (
SELECT business_address, geohash, 12 AS SLNO FROM geolocation WHERE geohash LIKE (CONCAT(SUBSTRING(:'geohash', 1, 12), '%'))
UNION ALL
SELECT business_address, geohash, 11 FROM geolocation WHERE geohash LIKE (CONCAT(SUBSTRING(:'geohash', 1, 11), '%'))
UNION ALL
SELECT business_address, geohash, 10 FROM geolocation WHERE geohash LIKE (CONCAT(SUBSTRING(:'geohash', 1, 10), '%'))
UNION ALL
SELECT business_address, geohash, 9 FROM geolocation WHERE geohash LIKE (CONCAT(SUBSTRING(:'geohash', 1, 9), '%'))
UNION ALL
SELECT business_address, geohash, 8 FROM geolocation WHERE geohash LIKE (CONCAT(SUBSTRING(:'geohash', 1, 8), '%'))
UNION ALL
SELECT business_address, geohash, 7 FROM geolocation WHERE geohash LIKE (CONCAT(SUBSTRING(:'geohash', 1, 7), '%'));
)
SELECT business_address, geohash, MIN(SLNO) AS SLNO FROM CTE --- change MIN(Slno) as per your need
GROUP BY business_address, geohash
or you may change as per your need.
Upvotes: 2
Reputation: 9290
It's probably not exactly what you are looking for, but you could use a union to do this. Add a fixed column to each query, setting the priority of the returned resultset and order the final resultset from the union by this priority.
It will look like six queries (and perhaps function as such as well) but I do not think there is any other way. I'm not strong at query optimisations, but there is a chance that the end result will actually work like a single query on the background.
Upvotes: 0