BangolPhoenix
BangolPhoenix

Reputation: 393

where clause equality understanding

I have difficulty understanding this WHERE clause including equality condition in terms of algorithmic process. (I suppose I can't get the SQL grammar below since I can't really picture the process in which WHERE carrier= f.carrier clause is run step-by-step like in algorithm. (like I see the result set table and how it's processed in order))

SELECT id 
FROM flights AS f
WHERE distance < ( SELECT AVG(distance) 
                   FROM flights
                   WHERE carrier = f.carrier);

So, as I understand it, the process of running carrier=f.carrier goes like:

  1. Since it's a correlated subquery, the computer starts from outer subquery, meaning it takes just first row in a flight table and then goes on to inner subquery.
  2. In a inner subquery, it tries to find the same carrier of outer subquery's first row data and find the average of that specific carrier's distance. And it compares the average with the outer subquery's first row record and return the outcome. It does this until it finishes the whole table.

Is my understanding of sequence process correct? Thank you so much for reading!

Upvotes: 0

Views: 115

Answers (3)

Tim Biegeleisen
Tim Biegeleisen

Reputation: 522234

Your explanation is correct. Just for educational purposes, here is how you could do the same query using a join:

SELECT t1.id
FROM flights t1
INNER JOIN
(
    SELECT carrier, AVG(distanct) AS average_distance
    FROM flights
    GROUP BY carrier
) t2
    ON t1.carrier = t2.carrier
WHERE t1.distance < t2.average_distance

The non-correlated subquery finds the average distance for each carrier, and these values are then joined back to the original flights table. The WHERE clause compares the distance values from flights against the average values for the same carrier to decide which records to retain in the result set.

Upvotes: 1

d512
d512

Reputation: 34243

Your explanation is correct. That's basically what it's going to do, at least conceptually. If there's one thing that databases are good at, it's optimizations. That means that it may not actually work the way you think it might, even if you get the results you expect.

The best way to see what it's actually doing is to look at the query plan, which are the actual steps taken by the database to run the query. Most databases provide tools for viewing these. This is often a good idea when your query is not running as fast as you think it should.

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1270573

Your explanation is correct. However, if you are writing queries with more than one table, you should get in the habit of qualifying all column names. This is especially important with correlated subqueries, where mistakes can be very subtle:

SELECT f.id 
FROM flights f
WHERE f.distance < (SELECT AVG(f2.distance) 
                    FROM flights f2
                    WHERE f2.carrier = f.carrier
                   );

Upvotes: 0

Related Questions