Tisha
Tisha

Reputation: 857

What is the best way to archive a postgres database?

Here are the details:

The database has to be archived such that records older than 6 months can be copied to a new database and deleted from the main(production) database. The complexity here will be to copy all rows in all tables which have reference each other. After that, these copied rows from some of the tables (which are really huge and whose data is no more needed) will be deleted.

The postgres database is an Amazon RDS instance.

What is the best way to achieve this?

I was thinking either a Springboot application

OR

Have postgresql.conf invoke a shell script which invokes a sql batch. For the second approach, I am not sure how to edit a amazon RDS postgresql.conf file and where to specify the shell script. Where would be the sql batch written? This is a little new to me, appreciate any pointers.

Upvotes: 2

Views: 8952

Answers (1)

Serg M Ten
Serg M Ten

Reputation: 5606

It will be much faster if you do everything server side instead of using a Springboot application. The problem is not dump/restore which you could easily do with pg_dump utility or psql -d dbname -t -A -F";" -c "SELECT * FROM yourdata WHERE cutdate<=current_timestamp-interval '6 months'" > output.csv

But you have to guarantee that everything that is exported is loaded into the second database and that you do not delete anything that has not been exported.

I would first SELECT a subset of primary keys into a temporary table. Then use server side COPY command to export the preselected keys (and all its dependencies)

COPY (SELECT d.* FROM yourdata d INNER JOIN temporal t WHERE d.pk=t.pk) To '/tmp/yourdata.csv' WITH CSV DELIMITER ',';

After all the export files have been generated

DELETE FROM yourdata WHERE pk IN (SELECT pk FROM temporal)

Then on the backup database do

COPY yourdata(column1,column2,column3) FROM '/tmp/yourdata.csv' DELIMITER ',' CSV

You can write a script that invokes all that commands on server side using psql command line tool and last move the imported files into a permanent location just in case something went wrong and you need to process them again.

See Save PL/pgSQL output from PostgreSQL to a CSV file and How to import CSV file data into a PostgreSQL table?

Upvotes: 3

Related Questions