Reputation: 346
I have a medium-sized PostgreSQL 9.6 database with moderate traffic. The database is located on a virtual server with 4 CPU cores and 8192 MB of RAM.
Currently I am backing up the database every hour, using pg_dump on the server itself. As you'd expect, this can take some time, but the reason for this question is that the process eats up a lot of CPU, meaning that we regularly see degraded performance throughout the day.
Our pg_dump is run like so, to generate a dump for each table individually, as well as a single dump of all tables:
# $1 is the output directory passed to the script.
# Dump each table in the public schema individually:
for table in $(psql -d "XXX" -At -c "SELECT table_name FROM information_schema.tables WHERE table_type = 'BASE TABLE' AND table_schema = 'public'");
do pg_dump -Fc -t "$table" -d "XXX" > "$1/$table.bak";
done;
# ...followed by a single dump of all tables:
pg_dump -Fc -d "XXX" > "$1/all_tables.bak";
So my question is: how can I optimize the backup process? Ideally, I am looking for the approach with the lowest CPU cost.
I have tried a few things so far, such as offloading the dump process to another server (sketched below), but the results have been limited...
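The offloading attempt was roughly along these lines, run from a separate backup host that can reach the database over the network; db-host, backup_user and /backups are placeholders:

# Run from the backup host. Only the -Fc compression moves off the
# database server; the server still does the read/COPY work for every table.
pg_dump -h db-host -U backup_user -Fc -d "XXX" > /backups/all_tables.bak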
Any suggestions would be greatly appreciated!
Upvotes: 0
Views: 1190
Reputation: 44137
If you want backups with hourly granularity, you should probably use pg_basebackup and WAL archiving (or WAL streaming, with archival done from the replica) to create physical backups, rather than pg_dump to create logical ones. You can then use PITR to restore to almost any point in time you want. You will have to take a new base backup occasionally to keep restore time down, but almost certainly not every hour. Also, pg_basebackup has a low CPU load (apart from compression, but that is done on the client side, not the database side, if you run pg_basebackup over the network).
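A minimal sketch of that setup on 9.6, assuming /backups is a location the database server can write WAL to and the backup host can read from (e.g. a shared mount), and that pg_hba.conf already allows a replication connection for backup_user; db-host, backup_user and the paths are placeholders:

# postgresql.conf on the database server: keep WAL and archive it
wal_level = replica
max_wal_senders = 3
archive_mode = on
archive_command = 'test ! -f /backups/wal/%f && cp %p /backups/wal/%f'

# Run occasionally from the backup host (not every hour); -z compresses
# on the backup host, so the database server pays almost no CPU for it:
pg_basebackup -h db-host -U backup_user -D /backups/base/$(date +%F) -Ft -z -X fetch

# To restore to a point in time (PITR), unpack a base backup and put a
# recovery.conf next to it with, for example:
#   restore_command = 'cp /backups/wal/%f %p'
#   recovery_target_time = '2021-06-01 12:00:00'

The point is that the heavy lifting (compression) happens on the backup host, while the database server only streams files and copies WAL segments.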
Upvotes: 1