Mike Yermolayev

Reputation: 158

How to re-partition table with hash in PostgreSQL?

I'm currently designing a table and want to hash-partition it by account_name. For now I'm thinking of starting with a small number of partitions (e.g. 8), but since I expect a lot of data, there is a chance I will need to re-partition it later into more partitions.
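For illustration, an 8-way layout of this kind could be sketched as follows. The table name `events` and every column other than account_name are hypothetical, not taken from the actual design:

```sql
-- Hypothetical parent table; only account_name comes from the question.
CREATE TABLE events (
    account_name text        NOT NULL,
    created_at   timestamptz NOT NULL DEFAULT now(),
    payload      jsonb
) PARTITION BY HASH (account_name);

-- One partition per remainder: with MODULUS 8 the remainders are 0..7.
CREATE TABLE events_p0 PARTITION OF events FOR VALUES WITH (MODULUS 8, REMAINDER 0);
CREATE TABLE events_p1 PARTITION OF events FOR VALUES WITH (MODULUS 8, REMAINDER 1);
CREATE TABLE events_p2 PARTITION OF events FOR VALUES WITH (MODULUS 8, REMAINDER 2);
CREATE TABLE events_p3 PARTITION OF events FOR VALUES WITH (MODULUS 8, REMAINDER 3);
CREATE TABLE events_p4 PARTITION OF events FOR VALUES WITH (MODULUS 8, REMAINDER 4);
CREATE TABLE events_p5 PARTITION OF events FOR VALUES WITH (MODULUS 8, REMAINDER 5);
CREATE TABLE events_p6 PARTITION OF events FOR VALUES WITH (MODULUS 8, REMAINDER 6);
CREATE TABLE events_p7 PARTITION OF events FOR VALUES WITH (MODULUS 8, REMAINDER 7);
```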

What is the best way to do this? If I understand correctly, I can't just attach new partitions, since I would also need to change the modulus of the existing ones.

Should I copy and re-insert all the data, or is there an easier way?

Upvotes: 0

Views: 2048

Answers (2)

A. Smith

Reputation: 487

One reasonable method that avoids a lot of downtime is to create a new table with exactly the same columns as the old one. This table gets your new set of partitions. Load all the current data into it, using a timestamp column (or the primary key, if it is sequential) as a cutoff so that you know which rows make up your delta.
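A minimal sketch of this preload step, under some assumptions: the names old_tab and new_tab, the cutoff column created_at, the target of 16 partitions, and the cutoff value are all placeholders chosen for illustration.

```sql
-- Same columns as the old table, but with a fresh hash-partitioning layout.
-- Indexes are deliberately left out here and would need to be added separately.
CREATE TABLE new_tab (LIKE old_tab INCLUDING DEFAULTS)
    PARTITION BY HASH (account_name);

-- Create the 16 new partitions (remainders 0..15) in a loop.
DO $$
BEGIN
    FOR r IN 0..15 LOOP
        EXECUTE format(
            'CREATE TABLE new_tab_p%s PARTITION OF new_tab
                 FOR VALUES WITH (MODULUS 16, REMAINDER %s)', r, r);
    END LOOP;
END
$$;

-- Preload everything up to the chosen cutoff while the old table stays live.
-- Rows newer than the cutoff form the delta that is copied later.
INSERT INTO new_tab
SELECT * FROM old_tab
WHERE created_at <= TIMESTAMPTZ '2024-01-01 00:00:00+00';
```

This only works cleanly if rows older than the cutoff are no longer updated or deleted in old_tab. Otherwise those changes would be missed by a delta based purely on the cutoff column.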

Once the new table is loaded, schedule downtime, detach the partitions from both tables, and attach the partitions from the new table to the old table. Then run a parallel insert from the old (now detached) partitions, filtered on the timestamp or sequential key you chose, to bring over the delta.
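The swap during the downtime window might look roughly like this. It is a sketch only, assuming 8 old partitions named old_tab_p0 to old_tab_p7, 16 preloaded partitions named new_tab_p0 to new_tab_p15, and a cutoff column created_at. All of these names and the cutoff value are hypothetical.

```sql
BEGIN;

DO $$
BEGIN
    -- Detach the 8 old partitions; they survive as standalone tables.
    FOR r IN 0..7 LOOP
        EXECUTE format('ALTER TABLE old_tab DETACH PARTITION old_tab_p%s', r);
    END LOOP;

    -- Move the 16 preloaded partitions from new_tab over to old_tab.
    FOR r IN 0..15 LOOP
        EXECUTE format('ALTER TABLE new_tab DETACH PARTITION new_tab_p%s', r);
        EXECUTE format(
            'ALTER TABLE old_tab ATTACH PARTITION new_tab_p%s
                 FOR VALUES WITH (MODULUS 16, REMAINDER %s)', r, r);
    END LOOP;

    -- Copy only the delta: rows written after the preload cutoff.
    -- Inserting through old_tab routes each row to its new partition.
    FOR r IN 0..7 LOOP
        EXECUTE format(
            'INSERT INTO old_tab SELECT * FROM old_tab_p%s WHERE created_at > %L',
            r, '2024-01-01 00:00:00+00');
    END LOOP;
END
$$;

COMMIT;
```

The loop above runs the delta inserts one after another. To run them in parallel, as the answer describes, one option would be to commit the swap first and then issue one INSERT per old partition from separate sessions. Note also that ATTACH PARTITION scans each attached table to validate the partition constraint unless a matching CHECK constraint already exists, so the attach step is not free.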

This is how we do it, and it saves us a good bit of downtime: instead of copying all the data during the window, we preload it beforehand and copy only the delta during the downtime.

Upvotes: 1

Laurenz Albe

Reputation: 247270

Repartitioning means completely rewriting the table, as in

INSERT INTO new_tab SELECT * FROM old_tab;

which will cause extensive downtime. One way around this is to use logical replication with new_tab on the subscriber (standby) side (possible from v13 on).
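A rough sketch of what such a logical replication setup could look like. The publication and subscription names and the connection string are placeholders, and the sketch assumes that the publisher runs with wal_level = logical and that the repartitioned table has already been created on the subscriber:

```sql
-- On the publisher (the server holding the current table):
-- publish changes under the root table's identity rather than per partition,
-- so the subscriber's partition layout does not have to match.
CREATE PUBLICATION repartition_pub
    FOR TABLE old_tab
    WITH (publish_via_partition_root = true);

-- On the subscriber (a separate server with the new partition layout):
CREATE SUBSCRIPTION repartition_sub
    CONNECTION 'host=publisher.example dbname=mydb user=replicator'
    PUBLICATION repartition_pub;
```

Logical replication matches tables by schema-qualified name, so the repartitioned table on the subscriber has to carry the same name as the published root table on the publisher.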

But my recommendation is not to do that. Choose a reasonable number of partitions and stick with that.

Upvotes: 1
