francs

Reputation: 9219

Expanding a varchar column is very slow, why?

Hi, we need to modify a column of a big product table. Normally DDL statements execute quickly, but this DDL statement takes about 10 minutes. I want to know the reason! I just want to expand a varchar column. The details follow.

--table size
wapreader_log=> select pg_size_pretty(pg_relation_size('log_foot_mark'));
 pg_size_pretty 
----------------
 5441 MB
(1 row)


--table ddl
wapreader_log=> \d log_foot_mark
          Table "wapreader_log.log_foot_mark"
   Column    |            Type             | Modifiers 
-------------+-----------------------------+-----------
 id          | integer                     | not null
 create_time | timestamp without time zone | 
 sky_id      | integer                     | 
 url         | character varying(1000)     | 
 refer_url   | character varying(1000)     | 
 source      | character varying(64)       | 
 users       | character varying(64)       | 
 userm       | character varying(64)       | 
 usert       | character varying(64)       | 
 ip          | character varying(32)       | 
 module      | character varying(64)       | 
 resource_id | character varying(100)      | 
 user_agent  | character varying(128)      | 
Indexes:
    "pk_log_footmark" PRIMARY KEY, btree (id)


--alter column
wapreader_log=> \timing
Timing is on.

wapreader_log=>  ALTER TABLE wapreader_log.log_foot_mark ALTER column user_agent TYPE character varying(256); 
ALTER TABLE
Time: 603504.835 ms    

Upvotes: 0

Views: 2336

Answers (5)

user998303

Reputation: 156

This issue is no longer valid.

Since at least Postgres v7.2.8 (2005), varchar(n) fields have been stored as 4 bytes plus the actual compressed string: https://www.postgresql.org/docs/7.2/datatype-character.html. Later versions also introduced a short-string format, stored as 1 byte plus the actual string. So, unlike character(n), the declared maximum length of a varchar(n) field does not affect its actual storage, and increasing that maximum does not require a table rewrite.

It may be that at the time of the original post Postgres did (needlessly?) save a copy of the table when the maximum size of a varchar(n) was altered. However, since at least Postgres v9.3 (2013) it does not.
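
You can verify this on your own version. A table rewrite allocates a new relfilenode, so if the value below is unchanged after the ALTER, the change was metadata-only. A sketch, using the poster's table:

SELECT relfilenode FROM pg_class WHERE relname = 'log_foot_mark';
ALTER TABLE log_foot_mark ALTER COLUMN user_agent TYPE character varying(512);
SELECT relfilenode FROM pg_class WHERE relname = 'log_foot_mark';  -- same value => no rewrite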

Upvotes: 0

Kuberchaun

Reputation: 30362

Not sure if this is any faster; you will have to test it out. Try this approach until PostgreSQL can handle the kind of ALTER you want without rewriting the entire stinking table.

ALTER TABLE log_foot_mark RENAME refer_url TO refer_url_old;
ALTER TABLE log_foot_mark ADD COLUMN refer_url character varying(256);

Then, using the table's indexed primary key or a unique key, copy the data across in a batched loop, committing each batch. You will have to drive this from Perl or some other language that lets you commit on every loop iteration. In pseudocode:

WHILE (end <= MAX_RECORDS) LOOP
    BEGIN TRANSACTION;

    UPDATE log_foot_mark
    SET refer_url = refer_url_old
    WHERE id >= start AND id <= end;

    COMMIT TRANSACTION;

    start := start + batch_size;
    end   := end + batch_size;
END LOOP;

ALTER TABLE log_foot_mark DROP COLUMN refer_url_old;

Keep in mind that the loop logic will need to be in something other than PL/pgSQL to get it to commit every loop iteration. Test it with no loop at all, then with batch sizes of 10k, 20k, 30k, and so on, until you find the sweet spot.
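
(As an aside, on PostgreSQL 11 and later the loop can live server-side, because procedures are allowed to COMMIT inside a loop. A minimal sketch, assuming the renamed columns from above; the procedure name is hypothetical:)

CREATE PROCEDURE backfill_refer_url(batch_size integer DEFAULT 20000)
LANGUAGE plpgsql
AS $$
DECLARE
    max_id integer;
    cur    integer := 0;
BEGIN
    SELECT max(id) INTO max_id FROM log_foot_mark;
    WHILE cur <= max_id LOOP
        UPDATE log_foot_mark
        SET refer_url = refer_url_old
        WHERE id > cur AND id <= cur + batch_size;
        COMMIT;  -- transaction control in procedures is allowed since PostgreSQL 11
        cur := cur + batch_size;
    END LOOP;
END;
$$;

CALL backfill_refer_url();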

Upvotes: 0

Žiga

Reputation: 79

One way to avoid a table rewrite is to use SQL domains (see CREATE DOMAIN) instead of varchars in your table. You can then add and remove constraints on a domain.

Note that this does not work instantly either, since all tables using the domain are checked for constraint validity, but it is less expensive than a full table rewrite and it doesn't need the extra disk space.
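
A minimal sketch of the idea, with a hypothetical domain name url_t. ALTER DOMAIN ... ADD CONSTRAINT scans the tables that use the domain to validate existing rows, but it does not rewrite them:

CREATE DOMAIN url_t AS text
    CONSTRAINT url_t_len CHECK (char_length(VALUE) <= 1000);

-- declare the column as url_t instead of varchar(1000); later, to "expand" it:
ALTER DOMAIN url_t DROP CONSTRAINT url_t_len;
ALTER DOMAIN url_t ADD CONSTRAINT url_t_len CHECK (char_length(VALUE) <= 2000);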

Upvotes: 0

Greg Smith

Reputation: 18186

When you alter a table, PostgreSQL has to make sure the old version doesn't go away in some cases, to allow rolling back the change if the server crashes before it's committed and/or written to disk. For those reasons, what it actually does here even on what seems to be a trivial change is write out a whole new copy of the table somewhere else first. When that's finished, it then swaps over to the new one. Note that when this happens, you'll need enough disk space to hold both copies as well.

There are some types of DDL changes that can be made without making a second copy of the table, but this is not one of them. For example, you can add a new column that defaults to NULL quickly. But adding a new column with a non-NULL default requires making a new copy instead.
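
To illustrate on the poster's table (a sketch of the behavior on versions current when this was written; newer releases have since made more of these changes metadata-only):

ALTER TABLE log_foot_mark ADD COLUMN note text;               -- fast: nullable, no default, no rewrite
ALTER TABLE log_foot_mark ADD COLUMN flag integer DEFAULT 0;  -- forces a full copy of the table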

Upvotes: 3

Frank Heikens

Reputation: 127396

ALTER ... TYPE requires a complete table rewrite, which is why it can take some time to complete on large tables. If you don't need a length constraint, then don't use one. Drop these constraints once and for all, and you will never run into problems from obsolete constraints again. Just use TEXT or VARCHAR.
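
For example, to drop the length limit on the column from the question (a sketch; on old versions this ALTER itself rewrites the table one last time, after which the problem never recurs):

ALTER TABLE wapreader_log.log_foot_mark
    ALTER COLUMN user_agent TYPE text;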

Upvotes: 5
