Denis Kulagin
Denis Kulagin

Reputation: 8906

Selecting minimum within JOIN

I have two tables (listed only fields important for the question):

t_groups

t_goods

Now I need a query, which selects group names and name of the cheapest good in each group. Tried doing it this way:

SELECT gr.groupId, grname, g.name
FROM t_groups AS gr
LEFT JOIN (SELECT * FROM t_goods ORDER BY PRICE ASC LIMIT 1) AS g
ON g.groupId = gr.groupId

but it doesn't work — returns NULLs in g.name field. It could be easily explained:

SELECT within JOIN statement selects cheapest good first, and then tries to "filter it" by groupId. Obviously, it'll only work for the group cheapest good belongs to.

How do I solve the task?

Upvotes: 1

Views: 5975

Answers (2)

Jon
Jon

Reputation: 437336

Why your query does not work

SELECT gr.groupId, grname, g.name
FROM t_groups AS gr
LEFT JOIN (SELECT * FROM t_goods ORDER BY PRICE ASC LIMIT 1) AS g
ON g.groupId = gr.groupId

The inner query selects the absolutely cheapest good (irrespective of group) in your database. Therefore, when you LEFT JOIN the groups to this result set, only the group which actually includes the universally cheapest good has a matching row (that group should get the g.name column filled properly). However, due to the way LEFT JOIN works all other groups will get NULL as the value of all columns in g.

The correct solution

First, you need to select the cheapest price in each group. This is easy:

SELECT groupId, MIN(price) AS minPrice FROM t_goods GROUP BY (groupId)

However the cheapest price is not useful without the associated goodId. The problem is that it's not meaningful to write something like:

/* does not make sense, although MySql has historically allowed it */
SELECT goodId, groupId, MIN(price) AS minPrice FROM t_goods GROUP BY (groupId)

The reason is that you cannot select a non-grouped column (i.e. goodId) unless you wrap it in an aggregate function (such as MIN): we don't know which goodId you want from among those that share the same groupId.

The correct, portable way to get the goodId of the cheapest goods in each group is

SELECT goodId, temp.groupId, temp.minPrice
FROM (SELECT groupId, MIN(price) AS minPrice FROM t_goods GROUP BY groupId) temp
      JOIN t_goods ON temp.groupId = t_goods.groupId AND temp.minPrice = t_goods.price)

The above query first finds out the cheapest price per group, and then joins to the goods table again to find the goodIds of the goods having that price inside that group.

Important: if multiple goods have an equal cheapest price in a group, this query will return all of them. If you only want one result per group you have to specify the tiebreaker, for example:

SELECT MIN(goodId), temp.groupId, MIN(temp.minPrice)
FROM (SELECT groupId, MIN(price) AS minPrice FROM t_goods GROUP BY groupId) temp
      JOIN t_goods ON temp.groupId = t_goods.groupId AND temp.minPrice = t_goods.price)
GROUP BY temp.groupId

With this query in hand, you can then find the name and price of the single cheapest good in each group (lowest goodId will be used as tiebreaker):

SELECT groupId, grname, gd.name, t3.minPrice
FROM t_groups AS gr
LEFT JOIN (SELECT MIN(goodId) AS goodId, t1.groupId, MIN(t1.minPrice) AS minPrice
           FROM (SELECT groupId, MIN(price) AS minPrice FROM t_goods GROUP BY groupId) t1
                 JOIN t_goods ON t1.groupId = t_goods.groupId AND t1.minPrice = t_goods.price
           ) t2
) t3 ON gr.groupId = t3.groupId
LEFT JOIN t_goods gd ON t3.goodId = gd.goodId

This final query performs two joins at its "outer" level:

  • joins groups with the "goodId and cheapest price for each group" table to get the goodId and cheapest price
  • then joins with the goods table to get the name of the good with this goodId

It will produce only one good per group, even if multiple goods are tied for cheapest.

Upvotes: 7

Matt Fenwick
Matt Fenwick

Reputation: 49085

Here's how you could do it:

select 
  t_groups.grname as `name of group`,
  t_goods.name as `name of good`
from (
  select 
    groupId, 
    min(price) as min_price
  from t_goods 
  group by groupId
) as mins
inner join t_goods
  on mins.groupId = t_goods.groupId and mins.min_price = t_goods.price
inner join t_groups
  on mins.groupId = t_groups.groupId

How this works:

  • mins subquery gets the minimum price for each groupId
  • joining mins to t_goods pulls all of the goods out that have the minimum price in their group. Note that this could return multiple goods in a single group, if there are multiple goods with the minimum price
  • that's then joined to t_groups to get the group name

Your query was presumably returning NULLs because it was left joining to a subquery with only one row.

Upvotes: 2

Related Questions