arilwan
arilwan

Reputation: 3993

Using BETWEEN SYMMETRIC for checking 2 row value expressions

I would like to compare two columns values from Table-1 (say colA, colX) to check if these are in the range of values for columns in Table-2 (say colB, colD, colY, colZ). That's, if colA is between colB-colD, and/or colX is between colY-colZ respectively.

I understand BETWEEN SYMMETRIC is used when order is irrelevant, e.g. (min, max) or (max,min). Below is the example I created.

TABLES:

CREATE TABLE location (id int, lat float, lon float, gpstime int);
CREATE TABLE trips( trip_id int, lat_start float, lat_end float,
               lon_start float, lon_end float, travel_mode text);

TABLE VALUES:

INSERT INTO location
SELECT 1, 41.2, 8.3, 11111 UNION ALL   --both lat and lon in the interval
SELECT 2, 43.3, 8.5, 22222 UNION ALL   --only lat meets criteria
SELECT 3, 44.0, 8.5,  33333            --only lon meets criteria

INSERT INTO trips
SELECT 1, 41.1, 41.9, 8.0, 8.8, 'foot' UNION ALL  
SELECT 2, 43.1, 43.5, 9.5, 8.9, 'bus' UNION ALL
SELECT 3, 42.1, 42.8, 8.7, 8.0, 'metro' 

TASK:

The goal is to get all rows for which either lat or lon (or both) in location table happen to satisfy the criteria (in trips table): lat between lat_start-lat_end; lon between lon_start- lon_end.

Clearly all rows of location table should be returned: row 1 meets all and rows 2 and 3 meet either criteria.

Expected result:

+---------+-----------+------+-----+-------------+
| trip_id | timestamp | lat  | lon | travel_mode |
+---------+-----------+------+-----+-------------+
|       1 |     11111 | 41.2 | 8.3 | foot        |
|       2 |     22222 | 43.1 | 8.5 | bus         |
|       3 |     33333 | 44.0 | 8.5 | metro       |
+---------+-----------+------+-----+-------------+

So I tried different queries using BETWEEN SYMMETRIC but the results aren't correct.

Query-1:

SELECT trip_id, gpstime AS timestamp, lat, lon,  travel_mode
FROM location 
INNER JOIN trips 
   ON (lat BETWEEN SYMMETRIC lat_start AND lat_end) 
   AND (lon BETWEEN SYMMETRIC lon_start AND lon_end) 

Query result:

+---------+-----------+------+-----+-------------+
| trip_id | timestamp | lat  | lon | travel_mode |
+---------+-----------+------+-----+-------------+
|       1 |     11111 | 41.2 | 8.3 | foot        |
+---------+-----------+------+-----+-------------+

Query-2: messed up everything

SELECT trip_id, gpstime AS timestamp, lat, lon,  travel_mode
FROM location 
INNER JOIN trips 
   ON (lat, lon) BETWEEN SYMMETRIC (lat_start,  lat_end) AND (lon_start, lon_end)

Result:

+---------+-----------+------+-----+-------------+
| trip_id | timestamp | lat  | lon | travel_mode |
+---------+-----------+------+-----+-------------+
|       2 |     11111 | 41.2 | 8.3 | foot        |
|       3 |     11111 | 41.2 | 8.3 | metro       |
+---------+-----------+------+-----+-------------+

Query-3: worst even

SELECT trip_id, gpstime AS timestamp, lat, lon,  travel_mode
FROM location 
INNER JOIN trips 
   ON lat BETWEEN SYMMETRIC lat_start AND lat_end
UNION 
SELECT trip_id, gpstime AS timestamp, lat, lon,  travel_mode
FROM location 
INNER JOIN trips 
   ON lon BETWEEN SYMMETRIC lon_start AND lon_end

Result:

+---------+-----------+------+-----+-------------+
| trip_id | timestamp | lat  | lon | travel_mode |
+---------+-----------+------+-----+-------------+
|       1 |     11111 | 41.2 | 8.3 | foot        |
|       1 |     22222 | 43.3 | 8.5 | foot        |
|       1 |     33333 | 44   | 8.5 | foot        |
|       2 |     22222 | 43.3 | 8.5 | bus         |
|       3 |     11111 | 41.2 | 8.3 | metro       |
|       3 |     22222 | 43.3 | 8.5 | metro       |
|       3 |     33333 | 44   | 8.5 | metro       |
+---------+-----------+------+-----+-------------+

Question: What is the correct query to achieve the desired result?

EDIT:

Based on answers received, recommending use of OR instead of AND in Query-1, I edit my question below to show the result of the recommended query:

QUERY:

SELECT trip_id, gpstime AS timestamp, lat, lon,  travel_mode
FROM location 
INNER JOIN trips 
   ON (lat BETWEEN SYMMETRIC lat_start AND lat_end) 
   OR (lon BETWEEN SYMMETRIC lon_start AND lon_end);

And here goes the query results:

+---------+-----------+------+-----+-------------+
| trip_id | timestamp | lat  | lon | travel_mode |
+---------+-----------+------+-----+-------------+
|       1 |     11111 | 41.2 | 8.3 | foot        |
|       3 |     11111 | 41.2 | 8.3 | metro       |
|       1 |     22222 | 43.3 | 8.5 | foot        |
|       2 |     22222 | 43.3 | 8.5 | bus         |
|       3 |     22222 | 43.3 | 8.5 | metro       |
|       1 |     33333 | 44   | 8.5 | foot        |
|       3 |     33333 | 44   | 8.5 | metro       |
+---------+-----------+------+-----+-------------+

Upvotes: 0

Views: 444

Answers (2)

Laurenz Albe
Laurenz Albe

Reputation: 246788

The query you are looking for is

SELECT trip_id, gpstime AS timestamp, lat, lon,  travel_mode
FROM location 
INNER JOIN trips 
   ON (lat BETWEEN SYMMETRIC lat_start AND lat_end) 
   OR (lon BETWEEN SYMMETRIC lon_start AND lon_end);

like Salman said, but let me help you clear up the misunderstanding.

What does BETWEEN SYMMETRIC mean?

x BETWEEN SYMMETRIC a AND b

is the same as

x BETWEEN SYMMETRIC LEAST(a, b) AND GREATEST(a, b)

So it just swaps boundaries if appropriate.

What does (a, b) < (c, d) mean?

Comparing tuples is done in a lexicographic fashion.

First, a and c are compared, and only if they are equal, b and d are compared.

So

(a, b) < (c, d)

is the same as

a < c OR (a = c AND b < d)

So that is not what you are looking for at all.

Upvotes: 0

Salman Arshad
Salman Arshad

Reputation: 272136

The goal is to get all rows for which either lat or lon (or both) in location table happen to satisfy the criteria

Replace AND with OR in your first query.

Upvotes: 1

Related Questions