bqui56
bqui56

Reputation: 2121

How to join many to many relationship on max value for relationship field

I have three tables:

Shop_Table
   shop_id
   shop_name

Sells_Table
   shop_id
   item_id
   price

Item_Table
   item_id
   item_name

The Sells_Table links the item and shop tables via FK's to their ids. I am trying to get the most expensive item from each store, i.e., output of the form:

(shop_name, item_name, price)
(shop_name, item_name, price)
(shop_name, item_name, price)
(shop_name, item_name, price)
...

where price is the max price item for each shop. I can seem to achieve (shop_name, max(price)) but when I try to include the item_name I am getting multiple entries for the shop_name. My current method is

create view shop_sells_item as
select s.shop_name as shop, i.item_name as item, price
from Shop_Table s
join Sells_Table on (s.shop_id = Sells_Table.shop_id)
join Item_Table i on (i.item_id = Sells_Table.item_id);

select shop, item, max(price)
from shop_sells_item
group by shop;

However, I get an error saying that item must appear in the GROUP BY clause or be used in an aggregate function, but if I include it then I don't get the max price for each shop, instead I get the max price for each shop,item pair which is of no use.

Also, is using a view the best way? could it be done via a single query?

Upvotes: 0

Views: 2321

Answers (5)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 656471

Probably fastest:

SELECT s.shop_name, i.item_name, sub.price
FROM  (
   SELECT DISTINCT ON (shop_id) shop_id, item_id, price 
   FROM   sells_table
   ORDER  BY shop_id, price DESC -- , item_id -- for 1 item only
   ) sub
JOIN   item_table i USING (item_id)
JOIN   shop_table s USING (shop_id);
  • To get the most expensive item from each store you have to ORDER BY .. price DESC.

  • The above query gives you all items for a shop that share the maximum price. Can be more than one. If you always want a single item, you have to define how to break ties. For example, uncomment the additional ORDER BY item above to pick the one with minimum item_id.

  • It's generally faster to aggregate (or select) first and then join to additional tables which are not relevant for the aggregation or selection itself.

Details for DISTINCT ON in this related answer:
Select first row in each GROUP BY group?

You can easily create a VIEW from the above query, performance is about the same as from raw SQL.

Index for top performance

If you want that fast, try a multi-column index like the following:

CREATE INDEX sells_table_special_idx
ON sells_table (shop_id, price DESC, item_id)

Note the descending order on the second column.
The third column only serves to make it a covering index, which only works for PostgreSQL 9.2 or later.

Upvotes: 1

wildplasser
wildplasser

Reputation: 44240

I am trying to get the most expensive item from each store,

Definition of most expensive item for one shop := there is no item for this shop that is more expensive

SELECT *
FROM Sells_Table st
WHERE NOT EXISTS (
    SELECT * FROM Sells_Table nx
    WHERE nx.shop_id = st.shop_id
    AND nx.price > st.price
);

Now we have the sells_table entries that correspond to the items you want: the skeleton of the result set. Just add the meat :

SELECT sh.shop_name -- <<-- Meat
    , it.item_name  -- <<-- Meat
    , st.price
FROM Sells_Table st
JOIN shop_table sh ON sh.shop_id = st.shop_id  -- <<-- Meat
JOIN items_table it ON it.item_id = st.item_id -- <<-- Meat
WHERE NOT EXISTS (
    SELECT * FROM Sells_Table nx
    WHERE nx.shop_id = st.shop_id
    AND nx.price > st.price
);

Upvotes: 1

miazo
miazo

Reputation: 144

Please note that the query below doesn't deal with a situation where multiple items in one store have the same maximum price (they are all the most expensive ones):

SELECT
    s.shop_name,
    i.item_name,
    si.price
FROM
    Sells_Table si
JOIN
    Shop_Table s
ON
    si.shop_id = s.shop_id
JOIN
    Item_Table i
ON
    si.item_id = i.item_id
WHERE
    (shop_id, price) IN (
        SELECT
            shop_id,
            MAX(price) AS price_max
        FROM
            Sells_Table
        GROUP BY
            shop_id
    );

Upvotes: 1

DRapp
DRapp

Reputation: 48139

Your question sounds a bit weird, but let me see if I can rephrase it.

For each store, I want to find the most expensive thing they sell. From that Item you want the name of it, its price and the store selling it. However, you also state you are using MIN() for the minimum price... Which is it, highest or lowest price. Either way, just change the min() vs max()

Start first with just the stores and the max price (or min) item they sell -- regardless of what it it

select shop_id, max(price) grouped by the shop_id only... Use this as the basis of then getting the item(s) that match this max price... A shop could have 10 items at $199 and you have nothing identifying you only want 1 and the criteria to handle that. I am getting the

select
      FindingMax.Shop_id,
      shop.shop_name,
      FindingMax.HighestPrice,
      Item_Table.item_name
   from
      ( select st.shop_id,
               max( st.price ) as HighestPrice
           from
               sells_table st
           group by
               st.shop_id ) as FindingMax
      JOIN Shop_Table shop
         on FindingMax.shop_id = shop.shop_id
      JOIN sells_table st2
         on FindingMax.shop_id = st2.shop_id
         AND FindingMax.HighestPrice = st2.price
         JOIN Item_Table
            on st2.item_id = Item_Table.item_id

Upvotes: 0

Tomasz Myrta
Tomasz Myrta

Reputation: 1144

You can do it Postgresql way:

select distinct on (shop_name) shop_name, item_name, price 
from shop_table
join sells_table using (shop_id) 
join item_table using (item_id)
order by shop_name, price;

Upvotes: 2

Related Questions