Reputation: 988
So i have three tables: products, offers, and offer_lines
offer_lines is a join table to make an has and belongs to many relation:
offer_lines has a column called calculated_price. This is based on the offers table's column discount_percentage and the products table's column price.
I want to create a sql statement that can calculate the discounted price of a product based on the discount percentage stored in the offers table.
This is what i got so far:
UPDATE offer_lines
SET discounted_price = (products.price - (products.price * offers.discount_percentage / 100))
FROM offer_lines AS o
INNER JOIN products ON o.product_id = products.id
INNER JOIN offers ON o.offer_id = offers.id
WHERE offer_lines.offer_id = 2;
This seems to work fine except for the fact that it takes about a minute to run.
Here is the explain:
Update on offer_lines (cost=77.16..1131.37 rows=10670 width=87)
-> Hash Join (cost=77.16..1131.37 rows=10670 width=87)
Hash Cond: (o.product_id = products.id)
-> Hash Join (cost=9.44..836.90 rows=10670 width=77)
Hash Cond: (o.offer_id = offers.id)
-> Seq Scan on offer_lines o (cost=0.00..620.74 rows=26674 width=14)
-> Hash (cost=9.38..9.38 rows=4 width=71)
-> Nested Loop (cost=0.29..9.38 rows=4 width=71)
-> Index Scan using index_offer_lines_on_offer_id on offer_lines (cost=0.29..8.30 rows=1 width=53)
Index Cond: (offer_id = 13)
-> Seq Scan on offers (cost=0.00..1.04 rows=4 width=18)
-> Hash (cost=62.88..62.88 rows=388 width=18)
-> Seq Scan on products (cost=0.00..62.88 rows=388 width=18)
Any ideas how i can make this run faster?
Upvotes: 0
Views: 44
Reputation: 1270411
First, the syntax for Postgres should not repeat the table being updated in the FROM
clause:
UPDATE offer_lines ol
SET discounted_price = (p.price - (p.price * o.discount_percentage / 100))
FROM products p, offers o
WHERE ol.offer_id = o.id AND
ol.product_id = p.id AND
ol.offer_id = 2;
(Wow, I can't believe I used a comma in the FROM
clause. Arrrgh.)
Then, for this query, you want indexes on offer_lines(offer_id, product_id)
, products(id, price)
(price is optional), and offers(id, discount_percentage)
(discount_percentage
is optional).
If I had to guess the performance problem is because order_lines
was both in the update
statement and the from
clause.
EDIT:
I should be clear. You can repeat the table in the FROM
clause. But it needs to be tied back to the version being updated:
UPDATE offer_lines ol
SET discounted_price = (p.price - (p.price * o.discount_percentage / 100))
FROM offer_lines ol2 INNER JOIN
products p
ON ol2.product_id = p.id INNER JOIN
offers o
ON ol.offer_id = o.id
WHERE ol2.offer_id = 2 AND ol.id = o.id;
This assumes that offer_lines
has a primary key column, which I've called id
. Frankly, I can see the benefits of doing the update this way, in terms of readability.
Upvotes: 1