Reputation: 135
I know the title is worded poorly, but I couldn't think of a better way to say it.
I'm learning Ruby and refreshing on MySQL. I'm using a historical list of completed flights as a practice data set, with about 100,000 rows to work with. Each flight record includes the origin and destination airports (fields 'origin' and 'dest'), and the total flight distance (field 'distance').
As an exercise I want to show the 10 longest routes sorted by distance descending. However, I want to consider each pair of endpoints as a single route, regardless of which is the origin and which is the destination. So for example JFK-LAX and LAX-JFK should be considered a single route.
When I run the query:
SELECT DISTINCT distance, origin, dest FROM flights ORDER BY distance DESC LIMIT 10;
of course I get this:
["2704", "BOS", "SFO"]
["2704", "SFO", "BOS"]
["2689", "BOS", "SJC"]
["2689", "SJC", "BOS"]
["2615", "LAX", "LIH"]
["2615", "LIH", "LAX"]
["2614", "HNL", "SAN"]
["2614", "SAN", "HNL"]
["2611", "BOS", "LAX"]
["2611", "LAX", "BOS"]
which is not what I want. I want to say, "Select the distance and endpoints of the 10 longest routes regardless of whether the airports are origins or destinations."
One thought I had was to sort each pair of endpoints alphabetically and join them together to create the unique route, e.g., LAX and JFK = "JFKLAX". But I don't know how to do that and pass it to my original query, or even if that's the best way to go about this.
Can this be done purely in SQL / MySQL?
Upvotes: 3
Views: 886
Reputation: 270609
One simple way to approach this is to use GREATEST()
and LEAST()
return whichever value of those two columns sorts higher or lower according to the columns' collation. They then always return in the same position, and the DISTINCT
will deduplicate them.
SELECT DISTINCT
distance,
LEAST(origin, dest) AS endpoint1,
GREATEST(origin, dest) AS endpoint2
FROM flights f
ORDER BY distance DESC LIMIT 10
Here it is in action on sqlfiddle
For example, LEAST('BOS', 'SFO')
will always return 'BOS'
, while GREATEST('BOS', 'SFO')
will always return 'SFO'
. No matter the order, when the rows are juxtaposed the result is the same so the DISTINCT
will apply correctly.
Upvotes: 3