RandomWhiteTrash

Reputation: 4014

Postgres 9 super slow simple delete

I have a substantial database... not a very large one - around 1gb of data in total.

I need to delete some rows from several tables. For example I have a table

 Order
 id     | ... | status   | ...
 1      | ... | 1        | ...
 ...
 40     | ... | 20       | ...
 41     | ... | 1        | ...
 ...
 470000 | ... | 12       | ...

Now I want to delete all orders that have status=1

I presume I do it with:

DELETE FROM Order WHERE status=1

All nice and simple, it would seem, but it takes ages! When I ran this query it was still going at 100% CPU usage after 40 minutes... and when I killed the process, nothing had been deleted.

When I tried limiting the scope by using

DELETE FROM Order WHERE status=1 AND id<1000

it took a couple of minutes to delete some 200 rows...

Is there anything I am missing in my configuration? Anything I should look for/check/change? Any ideas at all why it's so bloody inefficient?

Let me add that I am usually working with MySQL and need to manage this postgres database but don't really have any experience with postgres, so it may be something very simple.

Indexes are on both id and status columns.

Table has some 500k rows, around half needs to be deleted.

Execution plan:

Delete  (cost=0.00..19474.19 rows=266518 width=6)
  ->  Seq Scan on Orders  (cost=0.00..19474.19 rows=266518 width=6)
        Filter: (statusid = 1)

There are no triggers or rules of any sort. What's more, I should add that this is a fresh copy of the table: it was moved from another server with export/import. Perhaps this plays a role somehow?

Will deleting indexes help?

Upvotes: 9

Views: 11296

Answers (2)

Petr

Reputation: 1179

It is much quicker to use COPY TO / TRUNCATE / COPY FROM on big tables.

But, of course, you have to be careful with references and if possible, disable triggers.
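A minimal sketch of that approach, assuming the table is named orders, the rows to keep are those with status <> 1, and /tmp/orders_keep.csv is a path writable by the server process (all of these names are assumptions, not from the question):

```sql
-- Export only the rows we want to KEEP.
COPY (SELECT * FROM orders WHERE status <> 1)
    TO '/tmp/orders_keep.csv' WITH CSV;

-- Empty the table without per-row work. This fails if other tables
-- have foreign keys pointing here (TRUNCATE ... CASCADE would also
-- empty those, so check references first).
TRUNCATE orders;

-- Reload the surviving rows.
COPY orders FROM '/tmp/orders_keep.csv' WITH CSV;
```

Run all three statements in one transaction so other sessions never see an empty table; TRUNCATE is transactional in Postgres and can be rolled back.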

Upvotes: 3

evil otto

Reputation: 10582

Having nothing deleted after you killed the process is EXACTLY what you should see.

The delete happens as a transaction, meaning that either everything is deleted, or nothing is. To make sure that this can happen, the rows need to be copied somewhere before deleting them. Which means that deleting 250k rows takes about as much time as inserting that many rows. In some cases it can be faster to create a new table with everything NOT deleted and rename the new table to the old one.
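The rename trick above can be sketched as follows, assuming the table is called orders, that nothing else (views, foreign keys) depends on it, and that the index name shown is hypothetical:

```sql
BEGIN;

-- Build a new table containing only the rows to keep.
CREATE TABLE orders_new AS
    SELECT * FROM orders WHERE status <> 1;

-- Swap it in place of the old table.
DROP TABLE orders;
ALTER TABLE orders_new RENAME TO orders;

-- CREATE TABLE AS does not copy indexes, constraints, defaults or
-- triggers, so recreate them here, for example:
-- CREATE INDEX orders_status_idx ON orders (status);

COMMIT;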

If this is being moved from another db, you will probably be better off if you can keep the 250k rows from being inserted in the first place.

(This is general RDBMS wisdom, not postgresql specific - the details of how postgres MVCC works may differ significantly.)

Upvotes: 10
