Reputation: 373
I did try to look for this answer, but no luck : (
It should be something basic, but I can't make it work...
I have 3 tables:
personID | flightID |
---|---|
1 | 587 |
51 | 44 |
1 | 37 |
... | ... |
flightID | departure | arrive |
---|---|---|
37 | 1998-06-01 | 1998-06-03 |
587 | 2022-01-01 | 2022-01-02 |
44 | 2022-01-01 | 0000-00-00 |
... | ... | ... |
personID | countryID |
---|---|
1 | 12 |
51 | 27 |
... | ... |
GOAL: select peopleID, flightID, departure, arrive, countryID WHERE countryID = 12 AND ( LAST FLIGHT was before 2000-01-01 OR LAST FLIGHT was (after 2022-03-01 AND arrive time is 0000-00-00) )
Thanks for your help!
SELECT t1.personID, flight.flightID, MAX(flight.departure), flight.arrive, country.countryID
FROM flightPerson as t1
LEFT JOIN flight
ON t1.flightID = flight.flightID
LEFT JOIN country
ON country.personID = t1.personID
WHERE country.countryID = 12 AND
flight.departure < " 2000-01-01 " OR (flight.departure > " 2022-03-01" AND flight.arrive= "0000-00-00 00:00:00")
GROUP BY personID
ORDER BY t1.personID ASC
EDIT
Based on Chris Schaller answer, I rebuilt my query and it worked for me.
Basically (as Chris suggested in the second Legacy query) you first need to find the last flight ID, then retrieve details for that flight and "get" (using the WHERE clause) only flights whose departure and arrive respect your conditions.
I post it here in case it will be useful for someone else struggling with the same problem ; )
SELECT personID
FROM (
SELECT DISTINCT t2.personID, country.countryID
,(SELECT t1.flightID
FROM flightPerson
LEFT JOIN flight as t1
ON t1.flightID = flightPerson.flightID
WHERE t2.personID = flightPerson.personID
ORDER BY `t1`.` departure` DESC
LIMIT 1
)AS lastFlightID
FROM flightPerson as t2
LEFT JOIN country
ON country.personID = t2.personID
) AS details
LEFT JOIN flight AS b
ON details. lastFlightID = b.flightID
WHERE details.countryID = 12
AND ( b.departure < "2000-01-01"
OR
(b.departure > "2022-03-01" AND b.arrive = "0000-00-00 00:00:00")
)
PS Sorry Chris, I still need to use the double brackets AND (... OR (...) )
Kind of I feel more secure this way... :D
Upvotes: 2
Views: 39
Reputation: 16554
There are 2 general solutions to this, the simplest to explain is to use the ROW_NUMBER() window query to select the Last Flight instead of a group by. In MySQL v8 we can use a CTE to help keep the query readable:
WITH PersonFlightData as (
SELECT t1.personID, flight.flightID, flight.departure, flight.arrive, country.countryID
, ROW_NUMBER() OVER(PARTITION BY t1.personID ORDER BY flight.departure DESC) as RN
FROM flightPerson as t1
LEFT JOIN flight ON t1.flightID = flight.flightID
LEFT JOIN country ON country.personID = t1.personID
WHERE country.countryID = 12
)
SELECT personID, flightID, departure, arrive, countryID
FROM PersonFlightData
WHERE RN = 1 --(filters to only include the LAST flight for each person)
AND (departure < '2000-01-01' OR departure > '2022-03-01' AND arrive = '0000-00-00')
ORDER BY personID ASC
Unfortunately, for your provided dataset, there are no results to this query. Lets remove the departure
filter to understand, in fact, lets move the departure
filter to a select column to project it into the output:
WITH PersonFlightData as (
SELECT t1.personID, flight.flightID, flight.departure, flight.arrive, country.countryID
, ROW_NUMBER() OVER(PARTITION BY t1.personID ORDER BY flight.departure DESC) as RN
FROM flightPerson as t1
LEFT JOIN flight ON t1.flightID = flight.flightID
LEFT JOIN country ON country.personID = t1.personID
WHERE country.countryID = 12
)
SELECT personID, flightID, departure, arrive, countryID
, CASE WHEN departure < '2000-01-01' OR departure > '2022-03-01' AND arrive = '0000-00-00' THEN 1 END as Output
FROM PersonFlightData
WHERE RN = 1 --(filters to only include the LAST flight for each person);
personID | flightID | departure | arrive | countryID | Output |
---|---|---|---|---|---|
1 | 587 | 2022-01-01 | 2022-01-02 | 12 |
View this proof in a fiddle: https://www.db-fiddle.com/f/jKsg1B5RjW5UhTsLbtQHwe/0 Update the schema there with additional data to see if your desired flights are included.
So the last flight for personID=1
was on 2022-01-02, which is not in the required range. personID=51
flights are excluded as their country is 27, but the date of their last flight departure, even though it still has not yet landed ;) is not in the filtered range.
For older versions, we can't use CTEs or the ROW_NUMBER() window function, so lets go back to using GROUP BY
.
The problem with a GROUP BY
, although it seems logical, is that you need to first apply the grouping to determine the Last Flight and then you need to apply your filter only to the results from the Last Flight query. That is one of the problems that using a CTE also solved for us, in this case we will have to use a nested query:
SELECT d.personID, f.flightID, f.departure, f.arrive, countryID
FROM (
SELECT t1.personID, MAX(flight.departure) AS LastFlightDeparture
FROM flightPerson as t1
LEFT JOIN flight ON t1.flightID = flight.flightID
GROUP BY personID
) d
LEFT JOIN flightPerson fp ON d.personID = fp.personID
LEFT JOIN flight f ON fp.flightID = f.flightID AND f.departure = d.LastFlightDeparture
LEFT JOIN country ON country.personID = d.personID
WHERE country.countryID = 12
AND (f.departure < '2000-01-01' OR f.departure > '2022-03-01' AND f.arrive = '0000-00-00')
ORDER BY personID ASC;
You can see in this query we only get the departure
of the Last Flight, which isn't very efficient to join back into the rest of the query, I would prefer to get the ID of the last flight and use that, but to get the ID will require a different type of sub-query that might be even more inefficient, it is certainly increasing in complexity and becoming harder to read:
SELECT personID, flightID, departure, arrive, countryID, LastFlightID
FROM (
SELECT fp.personID, fp.flightID, f.departure, f.arrive, country.countryID
, (SELECT flight.flightID
FROM flight
LEFT JOIN flightPerson ON flight.flightID = flightPerson.flightID
WHERE flightPerson.personID = fp.personID
ORDER BY departure DESC
LIMIT 1
) as LastFlightID
FROM flightPerson fp
LEFT JOIN flight f ON fp.flightID = f.flightID
LEFT JOIN country ON country.personID = fp.personID
) flightData
WHERE countryID = 12
AND flightID = LastFlightID
AND (departure < '2000-01-01' OR departure > '2022-03-01' AND arrive = '0000-00-00')
ORDER BY personID ASC;
View this fiddle in MySQL 5.6 here: http://sqlfiddle.com/#!9/a8e82d/3
OR
clause...Your original filter expression:
WHERE country.countryID = 12
AND flight.departure < " 2000-01-01 "
OR (flight.departure > " 2022-03-01" AND flight.arrive= "0000-00-00 00:00:00")
Is missing some brackets, which I think you are aware of as you had the brackets correct in your explanation:
WHERE countryID = 12 AND ( LAST FLIGHT was before 2000-01-01 OR LAST FLIGHT was (after 2022-03-01 AND arrive time is 0000-00-00) )
What I like to do (because I am an application developer first) is to use indentation to visually separate the content within brackets to make it easier to identify the sub-expressions.
WHERE countryID = 12
AND (
LAST FLIGHT was before 2000-01-01
OR LAST FLIGHT was (after 2022-03-01 AND arrive time is 0000-00-00)
)
But in this case the inner brackets are not necessary because the OR
statement will evaluate the expression on either side of the OR
independently:
WHERE country.countryID = 12
AND (flight.departure < '2000-01-01' OR flight.departure > '2022-03-01' AND flight.arrive= '0000-00-00')
Which fits on one line, but perhaps is easier to read like this:
WHERE country.countryID = 12
AND (
flight.departure < '2000-01-01'
OR flight.departure > '2022-03-01' AND flight.arrive= '0000-00-00'
)
FYI: The fact that X number of tables is involved here is not really the complicating factor. The problem is that you want to apply additional logic to the grouped results, without that logic affecting the grouping. You then also want additional metadata from the rows that correlate to the MAX/MIN or FIRST/LAST record from the grouping.
In SQL that can really only be achieved though sub-queries, whether you use CTEs or Views or Table Valued Variables, the execution is the same, we need to force the engine to evaluate one resultset before applying additional filtering criteria.
Upvotes: 1