Reputation: 3228
This is a exercise from wikibooks, #6
The question is:
For each piece, find the most expensive offering of that piece and include the piece name, provider name, and price (note that there could be two providers who supply the same piece at the most expensive price).
The solution:
SELECT Pieces.Name, Providers.Name, Price
FROM Pieces INNER JOIN Provides ON Pieces.Code = Piece
INNER JOIN Providers ON Providers.Code = Provider
WHERE Price =
(
SELECT MAX(Price) FROM Provides
WHERE Piece = Pieces.Code
);
I don't understand the subquery part. I think the typical way to find the max price is
SELECT MAX(Price) FROM Provides
group by piece;
In this way, I can not use price = SUBQUERY or price in SUBQUERY, and that sub query in solutions, looks like
SELECT MAX(Price) FROM Provides, pieces where provides.piece=pieces.code;
It just return the biggest number, I can not figure out why it can 'group' and return the right rows.
Upvotes: 0
Views: 80
Reputation: 10206
I can not figure out why it can 'group' and return the right rows
Using a group and MAX aggregate function is also a working solution.
This:
SELECT Pieces.Name, Providers.Name, MAX(Price)
FROM Pieces INNER JOIN Provides ON Pieces.Code = Piece
INNER JOIN Providers ON Providers.Code = Provider
GROUP BY Piece
Will work on MySQL and return the same results.
The correlated subquery solution basically achieve the same thing, but expressed differently. However "my" solution won't work on many RDBMS because the columns in the SELEcT
are different than those in the GROUP BY
. It is allowed in MySQL.
Maybe they prefer to specify the subquery solution because it is standard. It's a bit weird as in a real situation, a developer will most comonly choose the GROUP BY
approach, and on other RDBMS will add the columns needed to make it work
Upvotes: 1
Reputation: 48139
Although Thomas does provide a much simpler solution, lets get back to your original question, WHY does it work.
SELECT Pieces.Name, Providers.Name, Price
FROM Pieces INNER JOIN Provides ON Pieces.Code = Piece
INNER JOIN Providers ON Providers.Code = Provider
WHERE Price =
(
SELECT MAX(Price) FROM Provides
WHERE Piece = Pieces.Code
);
First, I hate working with correlated subqueries as in this answer. A correlated subquery is one that the subquery is processed once for EACH record. Notice the outer part of the query provides the "Pieces" table reference. So the inner query is saying from the "PROVIDES" table, give me the maximum price for the current "Pieces.code" value. After that, it is a simple join to the other tables to grab the piece and provider details.
My personal preference is to do a pre-aggregate subquery ONCE on the "Provides" table by all codes with its own group by. This runs the query once, grouped so there is only one record per respective code. More often you will see queries like this to prevent the larger overhead. Also, it is good to work with alias names, especially if you are dealing ex using an alias.
from LongTableNamesInYourDatabase LTN
And whenever you work with multiple tables, always try to provide the table.column or alias.column so others trying to help you in the future know where a specific column is coming from and not just guessing.
SELECT
P.Name,
Prov.Name,
MaxByPiece.MaxPrice
FROM
( SELECT
Pr1.Piece,
MAX( Pr1.Price ) as MaxPrice
FROM
Provides Pr1
group by
Pr1.Piece ) as MaxByPiece
JOIN Provides Pr2 on MaxByPiece.Piece = Pr2.Piece AND MaxByPiece.MaxPrice = Pr2.Price
JOIN Pieces P on Pr2.Piece = P.Code
JOIN Providers Prov on Pr2.Provider = Prov.Code
It may look more complex, but is more applicable if you have multiple tables with multiple rows for a given thing (contract, order, person, sales rep, whatever) and would otherwise end up with a Cartesian result and wonder why the duplicates in the counts or totals.
The first from clause query does nothing but get the maximum price per any piece and I use the alias Pr1 to differentiate that from the next join. The join after that is the provides so we can find ALL Pieces at that price. Remember the question wanted ALL Providers at that maximum price. So Now, I have the records that qualified at the max price per piece and provider. So I finish by joining to those lookup tables so I can get the names
Upvotes: 2
Reputation: 94859
A WHERE clause works thus: with all the rows got from the FROM
clause (that is Pieces-Provides-Providers combinations in your case) check whether the conditions in WHERE
are true; keep the row only if this is the case.
WHERE Price =
(
SELECT MAX(Price) FROM Provides
WHERE Piece = Pieces.Code
)
Here you take a Pieces-Provides-Providers row, use its Pieces.Code and get all the matches from Provides. Then you take the maximum price from these. This value you compare with the Price of your Pieces-Provides-Providers row. If it's the same (i.e. if your joined row has the maximum provides price for the pieces code), then you keep that row.
Upvotes: 1