Reputation: 21
I have a performance problem with an UPDATE ... WHERE id IN (...). The pseudocode (boiled down) is:
1. Start the transaction from PHP:

        $pdoAdapter->beginTransaction();

2. Query MySQL for the price elements that arrive from the external world, are stored in the db, and should be invoiced:

        SELECT id, price
        FROM incoming_price_elements
        WHERE <advanced where>
        FOR UPDATE;

3. Fetch a new invoice id:

        SELECT id
        FROM invoice_id_counter
        FOR UPDATE;

        UPDATE invoice_id_counter SET id = id + 1;

4. Create an invoice in PHP with a lot of summing and other aggregation, and store the result:

        INSERT INTO invoices (id, total_price)
        VALUES (<id from 3.>, <total price summed>);

5. Update all the incoming_price_elements to mark them invoiced:

        UPDATE incoming_price_elements
        SET invoice_id = <id from 3.>
        WHERE id IN (<all ids selected in 2.>);

6. Commit.
My problem is that step 5 is pretty slow (in the area of seconds) and it blocks the id counter from step 3. There are more than 10,000 ids to update, and id is the primary key.
Any suggestions on how to optimize this? I was thinking about creating a temporary table and selecting all the ids into that, but I have zero experience with temporary tables.
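The temporary-table version I have in mind would be roughly this (untested sketch; `tmp_ids` is just a placeholder name, and the rest reuses the tables and placeholders from the steps above):

    CREATE TEMPORARY TABLE tmp_ids (id INT UNSIGNED PRIMARY KEY);

    -- capture the ids once, instead of shipping them back through PHP
    INSERT INTO tmp_ids
    SELECT id
    FROM incoming_price_elements
    WHERE <advanced where>;

    -- mark them invoiced via a join rather than a huge IN (...) list
    UPDATE incoming_price_elements a
    JOIN tmp_ids t ON t.id = a.id
    SET a.invoice_id = <id from 3.>;

    DROP TEMPORARY TABLE tmp_ids;

Would that be a sensible direction?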
Upvotes: 2
Views: 74
Reputation: 142296
Based on one of your comments, data is 'continually' being added to incoming_price_elements? And that table is sort of a "staging area"? And one problem is that if new items are added before you finish processing, then there is trouble?
If so, then let's flip-flop.
    CREATE TABLE next_icp LIKE incoming_price_elements;

    RENAME TABLE incoming_price_elements TO prev_icp,
                 next_icp TO incoming_price_elements;  -- swap in an empty table

    -- now process `prev_icp` as discussed already.
As for the sluggishness, let's try to do more processing before starting the transaction. Can you add some extra columns to incoming_price_elements to handle the "lot of summing and other aggregation"? After that, start the transaction, do the other steps, and finally (as scaisEdge suggests) do a "multi-table UPDATE" (with a JOIN) instead of IN.
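For example, something along these lines (a sketch; `computed_total` is an invented column name, and the exact per-row math depends on what your PHP aggregation does):

    -- before the transaction: fill in the per-row derived values
    UPDATE incoming_price_elements
    SET computed_total = <per-row value the PHP code currently derives>
    WHERE <advanced where>;

    -- inside the transaction, the "lot of summing" collapses to one cheap query
    SELECT SUM(computed_total)
    FROM incoming_price_elements
    WHERE <advanced where>;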
If that is still too invasive, then do the ids 100 at a time, not all at once. UPDATE must save the old rows in case of a crash (etc.); that gets to be costly. By breaking it up, you let other connections get some work done.
That is, put a loop around steps 1..6 and focus on only 100 rows at a time, until you exhaust the staging table. This will let other ids be grabbed in the middle; I hope that is OK.
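One iteration of that loop might look like this (a sketch; the batch size of 100 and the <placeholders> carry over from the question, and note that LIMIT without an ORDER BY picks an arbitrary 100 rows):

    START TRANSACTION;

    SELECT id, price
    FROM incoming_price_elements
    WHERE <advanced where>
    LIMIT 100
    FOR UPDATE;

    -- steps 3..5 as in the question, with WHERE id IN (<the 100 ids just selected>)

    COMMIT;
    -- repeat until the SELECT returns no rows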
Upvotes: 0
Reputation: 133370
Instead of an IN clause you could use an INNER JOIN:
    UPDATE incoming_price_elements a
    INNER JOIN (
        SELECT id
        FROM incoming_price_elements
        WHERE <advanced where>
    ) t ON t.id = a.id
    SET a.invoice_id = <id from 3.>;
This should improve the performance of that part of your transaction.
Upvotes: 1