Ben
Ben

Reputation: 361

SELECT which WHERE clause matched with result

I have a database where I want to do a very quick and rough geohash matching. I want to find the first, most similar match by reducing the resolution of the geohash until I get a hit. The following query gets me all the geohashes I'm interested in, but I need to select the one from the below set that is the closest match to my original geohash.

SELECT business_address, geohash FROM geolocation
    WHERE
        geohash LIKE (CONCAT(SUBSTRING(:'geohash', 1, 12), '%')) OR
        geohash LIKE (CONCAT(SUBSTRING(:'geohash', 1, 11), '%')) OR
        geohash LIKE (CONCAT(SUBSTRING(:'geohash', 1, 10), '%')) OR
        geohash LIKE (CONCAT(SUBSTRING(:'geohash', 1, 9), '%')) OR
        geohash LIKE (CONCAT(SUBSTRING(:'geohash', 1, 8), '%')) OR
        geohash LIKE (CONCAT(SUBSTRING(:'geohash', 1, 7), '%'));

The problem is that there is no clear way for me to order this result, or add a column that I can use to order them. Simply limiting the results to 1 and ordering by geohash isn't sufficient, as lower resolution matches might be on either side of the full resolution geohash. Is there a way for me to do this without doing 6 separate queries?

Upvotes: 2

Views: 87

Answers (3)

Gordon Linoff
Gordon Linoff

Reputation: 1269443

You can use generate_series():

SELECT DISTINCT ON (gl.business_address) gl.business_address, gl.geohash
FROM geolocation gl JOIN
     generate_series(12, 7, -1) gs(val)
     ON gl.geohash LIKE SUBSTRING(:'geohash', 1, gs.val) || '%')
ORDER BY gl.business_address, gl.geohash ASC;

This uses DISTINCT ON because that is usually faster than GROUP BY and allows you to include additional columns.

If the values you want to compare are not really sequential numbers, use VALUES():

SELECT DISTINCT ON (gl.business_address) gl.business_address, gl.geohash
FROM geolocation gl JOIN
     (VALUES (12), (11), (10), (9), (8), (7)) v(val)
     ON gl.geohash LIKE SUBSTRING(:'geohash', 1, v.val) || '%')
ORDER BY gl.business_address, gl.geohash ASC;

Upvotes: 0

DarkRob
DarkRob

Reputation: 3833

You may do this by giving matching slno to all the matches, then find the minimum of slno to get the closest of match

WITH CTE AS (
SELECT business_address, geohash, 12 AS SLNO FROM geolocation WHERE geohash LIKE (CONCAT(SUBSTRING(:'geohash', 1, 12), '%')) 
UNION ALL 
SELECT business_address, geohash, 11 FROM geolocation WHERE geohash LIKE (CONCAT(SUBSTRING(:'geohash', 1, 11), '%')) 
UNION ALL 
SELECT business_address, geohash, 10 FROM geolocation WHERE geohash LIKE (CONCAT(SUBSTRING(:'geohash', 1, 10), '%')) 
UNION ALL 
SELECT business_address, geohash, 9 FROM geolocation WHERE geohash LIKE (CONCAT(SUBSTRING(:'geohash', 1, 9), '%')) 
UNION ALL 
SELECT business_address, geohash, 8 FROM geolocation WHERE geohash LIKE (CONCAT(SUBSTRING(:'geohash', 1, 8), '%')) 
UNION ALL 
SELECT business_address, geohash, 7 FROM geolocation WHERE geohash LIKE (CONCAT(SUBSTRING(:'geohash', 1, 7), '%'));
)
SELECT business_address, geohash, MIN(SLNO) AS SLNO FROM CTE   --- change MIN(Slno) as per your need
GROUP BY business_address, geohash

or you may change as per your need.

Upvotes: 2

pyrocumulus
pyrocumulus

Reputation: 9290

It's probably not exactly what you are looking for, but you could use a union to do this. Add a fixed column to each query, setting the priority of the returned resultset and order the final resultset from the union by this priority.

It will look like six queries (and perhaps function as such as well) but I do not think there is any other way. I'm not strong at query optimisations, but there is a chance that the end result will actually work like a single query on the background.

Upvotes: 0

Related Questions