Chris

Reputation: 4505

How can I avoid fragmented database storage caused by very frequent updates?

Suppose I have the following table:

CREATE TABLE test
(
  "id" integer NOT NULL,
  "myval" text NOT NULL,
  CONSTRAINT "test-id-pkey" PRIMARY KEY ("id")
)

and I run a lot of queries like the following:

UPDATE "test" set "myval" = "myval" || 'foobar' where "id" = 12345

Then the value in myval will get larger and larger over time. What will PostgreSQL do? Where will it get the space from?

Can I make sure that PostgreSQL needs no more than one seek to read a particular myval value?

Will PostgreSQL do this automatically?

I know that normally I should normalize the data much more, but I need to read the value with a single seek. The myval value will grow by about 20 bytes with each update that appends data. Some rows will get 1-2 updates, some 1000. Normally I would just insert a new row for each update instead of updating in place, but then selecting gets slow. That is how I came to the idea of denormalizing.
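
For reference, the normalized row-per-update design I mean is roughly this (table and column names are only illustrative):

CREATE TABLE test_parts
(
  "id" integer NOT NULL,
  "seq" serial NOT NULL,
  "part" text NOT NULL,
  CONSTRAINT "test_parts-pkey" PRIMARY KEY ("id", "seq")
)

-- appending is a plain insert:
INSERT INTO test_parts ("id", "part") VALUES (12345, 'foobar')

-- but reading one value back has to aggregate many rows:
SELECT string_agg("part", '' ORDER BY "seq") FROM test_parts WHERE "id" = 12345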

Upvotes: 1

Views: 2724

Answers (2)

Frank Heikens

Reputation: 127297

Change the FILLFACTOR of the table to reserve space for future updates. Because the text column has no index, these can also be HOT updates, which makes the updates faster and lowers the autovacuum overhead, since HOT updates do a micro-vacuum. The CREATE TABLE documentation has some information about the FILLFACTOR.

ALTER TABLE test SET (fillfactor = 70);
-- rebuild the table so the new fillfactor leaves free space on every page:
VACUUM FULL ANALYZE test;
-- start testing

The value 70 is not the perfect setting; it depends on your unique situation. Maybe you're fine with 90, but it could also be 40 or something else.
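
One way to check whether the updates actually end up as HOT updates is the statistics view pg_stat_user_tables; what ratio counts as good enough depends on your workload:

-- n_tup_hot_upd should grow at roughly the same rate as n_tup_upd
-- once the fillfactor leaves enough free space on each page
SELECT n_tup_upd, n_tup_hot_upd FROM pg_stat_user_tables WHERE relname = 'test'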

Upvotes: 4

mu is too short

Reputation: 434745

This is related to another question about TEXT in PostgreSQL, or at least the answer is similar. PostgreSQL stores large column values away from the main table storage (the TOAST mechanism):

Very long values are also stored in background tables so that they do not interfere with rapid access to shorter column values.

So you can expect a large TEXT (or BYTEA or large VARCHAR) value to be stored away from the main table, and something like SELECT id, myval FROM test WHERE id = 12345 will take two seeks to pull both columns off the disk (and more seeks to resolve their locations).
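
You can see this for yourself by looking up the TOAST table behind test and comparing sizes, for example:

-- the TOAST table that backs "test" (shows "-" if nothing has been toasted yet)
SELECT reltoastrelid::regclass FROM pg_class WHERE relname = 'test'

-- heap size alone vs. total size including TOAST and indexes
SELECT pg_relation_size('test') AS main_table,
       pg_total_relation_size('test') AS total_with_toast_and_indexes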

If your UPDATEs really are causing your SELECTs to slow down then perhaps you need to review your vacuuming strategy.
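
For example, autovacuum can be tuned per table so that dead row versions from the frequent updates are cleaned up sooner; the numbers below are only a starting point, not a recommendation:

-- vacuum after ~5% of the rows have changed instead of the default 20%
ALTER TABLE test SET (autovacuum_vacuum_scale_factor = 0.05, autovacuum_vacuum_threshold = 50)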

Upvotes: 1
