Reputation: 670
I have a problem - a table with over 20 million rows.
When I add a new column with a default, PostgreSQL locks the table for over 40 minutes, so my application stops working for that whole time.
So instead of
ALTER TABLE "test" ADD COLUMN "field" boolean DEFAULT True NOT NULL;
I do
ALTER TABLE "test" ADD COLUMN "field" boolean NULL;
ALTER TABLE "test" ALTER COLUMN "field" SET DEFAULT true;
after which every new row will get true by default, so now I need to update the 20 million existing rows.
I update them in batches:
WITH cte AS (
SELECT id as pk
FROM "test"
WHERE "field" is null
LIMIT 10000
)
UPDATE "test" table_
SET "field" = true
FROM cte
WHERE table_.id = cte.pk
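For reference, one way to drive this loop from psql is \watch (just a sketch; the interval is arbitrary, and under autocommit each run is committed on its own, so finished batches stay committed):
-- sketch: no trailing semicolon, so the statement stays in psql's query buffer;
-- \watch then executes it and re-executes it every 5 seconds.
-- Cancel (Ctrl+C) once it starts reporting UPDATE 0.
WITH cte AS (
    SELECT id AS pk
    FROM "test"
    WHERE "field" IS NULL
    LIMIT 10000
)
UPDATE "test" table_
SET "field" = true
FROM cte
WHERE table_.id = cte.pk
\watch 5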
after which I do
ALTER TABLE "test" ALTER COLUMN "field" SET NOT NULL;
And everything works, but updating the rows is too slow. Can you give me some advice on improving the speed of the update?
Currently it updates 10000 rows in about 2 minutes. I tried decreasing the batch size to 1000 - that was better (3.5 minutes for 10000), but it is still too slow.
I also tried creating an index before the update, but it didn't give better results (as I understand it, it would only start helping once a huge part of the table has already been updated).
Upvotes: 2
Views: 447
Reputation: 247225
You can speed up UPDATEs with HOT. That requires that the updated column is not indexed (easy in your case) and that there is free space in the table blocks (fillfactor < 100). Not sure if you meet the second requirement.
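For example (90 is only an illustrative value; note that a changed fillfactor applies only to pages written afterwards, existing pages keep their current packing unless the table is rewritten):
-- illustrative value: leave roughly 10% of each block free for HOT updates
ALTER TABLE "test" SET (fillfactor = 90);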
Another thing to improve the speed of bulk updates is to increase max_wal_size so that checkpoints don't occur so often.
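For example (the value is just a placeholder; max_wal_size only needs a configuration reload, not a restart):
-- placeholder value: size it to cover the WAL produced between checkpoints
ALTER SYSTEM SET max_wal_size = '4GB';
SELECT pg_reload_conf();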
Maybe you can wait a few months for PostgreSQL v11. The following new feature would do what you need:
- Allow ALTER TABLE to add a column with a non-null default without a table rewrite (Andrew Dunstan, Serge Rielau)
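With that feature, the single statement from the question becomes a metadata-only change (the constant default is kept in the catalog, so there is no rewrite and no long lock):
-- on PostgreSQL 11+ this no longer rewrites the table
ALTER TABLE "test" ADD COLUMN "field" boolean DEFAULT true NOT NULL;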
Upvotes: 3