Reputation: 251
So, I recently asked a question: Update using a subquery with aggregates and groupby in Postgres
and it turns out I was approaching my problem with flawed logic.
In the same scenario as the question above, instead of updating all the rows to have the max quantity, I'd like to delete the rows that don't have the max quantity per item_name (and all but one of any tied max quantities).
Essentially, I need to convert the statement below into a DELETE that preserves only the largest quantity per item_name. I'm guessing I'll need NOT EXISTS here, but I'm not sure how to do that with aggregate functions.
UPDATE transaction t
SET quantity = sub.max_quantity
FROM (
SELECT item_name, max(quantity) AS max_quantity
FROM transaction
GROUP BY 1
) sub
WHERE t.item_name = sub.item_name
AND t.quantity IS DISTINCT FROM sub.max_quantity;
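For reference, the effect of that UPDATE can be reproduced in a minimal, self-contained sketch using Python's sqlite3 module (table and column names taken from the question, an id column added for illustration; SQLite needs a correlated subquery in place of Postgres's UPDATE ... FROM):

```python
import sqlite3

# Throwaway in-memory table mirroring the question's schema.
con = sqlite3.connect(":memory:")
con.execute('CREATE TABLE "transaction" '
            '(id INTEGER PRIMARY KEY, item_name TEXT, quantity INTEGER)')
con.executemany(
    'INSERT INTO "transaction" (item_name, quantity) VALUES (?, ?)',
    [("apple", 1), ("apple", 3), ("banana", 2), ("banana", 2)],
)

# Raise every row to the max quantity of its item_name group,
# which is what the Postgres UPDATE above does.
con.execute('''
    UPDATE "transaction"
    SET quantity = (SELECT max(quantity) FROM "transaction" t2
                    WHERE t2.item_name = "transaction".item_name)
''')

rows = con.execute(
    'SELECT item_name, quantity FROM "transaction" ORDER BY id').fetchall()
print(rows)  # [('apple', 3), ('apple', 3), ('banana', 2), ('banana', 2)]
```

After the update, every row carries its group's maximum, which is exactly why the rows become indistinguishable duplicates.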
Upvotes: 2
Views: 781
Reputation: 659217
Since there can be peers sharing the same maximum quantity, the safe route is a subquery with the window function row_number():
DELETE FROM transaction t
USING (
   SELECT some_unique_id, row_number() OVER (PARTITION BY item_name
                                             ORDER BY quantity DESC) AS rn
   FROM   transaction
   ) sub
WHERE  t.some_unique_id = sub.some_unique_id
AND    sub.rn > 1;
Where some_unique_id
can be any unique column or combination of columns (mirrored in the join condition of the outer DELETE). No GROUP BY
is needed: row_number() is a window function and operates on the plain rows.
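As a minimal sketch, the same technique can be exercised with Python's sqlite3 module (SQLite 3.25+ supports window functions). Note the adaptation: SQLite has no DELETE ... USING, so the rn > 1 ids are collected in a subquery instead, and the id primary key plays the role of some_unique_id:

```python
import sqlite3

# In-memory table with a tied maximum in the banana group.
con = sqlite3.connect(":memory:")
con.execute('CREATE TABLE "transaction" '
            '(id INTEGER PRIMARY KEY, item_name TEXT, quantity INTEGER)')
con.executemany(
    'INSERT INTO "transaction" (item_name, quantity) VALUES (?, ?)',
    [("apple", 1), ("apple", 3), ("banana", 2), ("banana", 2)],
)

# Keep only the first row per item_name when ordered by quantity DESC;
# every row with rn > 1 (including tied maxima) is deleted.
con.execute('''
    DELETE FROM "transaction"
    WHERE id IN (
        SELECT id FROM (
            SELECT id, row_number() OVER (PARTITION BY item_name
                                          ORDER BY quantity DESC) AS rn
            FROM "transaction"
        ) WHERE rn > 1
    )
''')

rows = con.execute(
    'SELECT item_name, quantity FROM "transaction" ORDER BY item_name'
).fetchall()
print(rows)  # [('apple', 3), ('banana', 2)]
```

Exactly one row per item_name survives, carrying the group's maximum quantity; which of two tied maxima survives is arbitrary unless a tiebreaker column is added to the ORDER BY.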
This ends up very similar to this question from today:
Delete rows with duplicates on two fields
If your table is big and you are going to delete large parts of it, consider the more advanced advice here:
How to delete duplicate entries?
Upvotes: 4