Reputation: 857
Here are the details:
The database has to be archived such that records older than 6 months can be copied to a new database and deleted from the main(production) database. The complexity here will be to copy all rows in all tables which have reference each other. After that, these copied rows from some of the tables (which are really huge and whose data is no more needed) will be deleted.
The postgres database is an Amazon RDS instance.
What is the best way to achieve this?
I was thinking either a Springboot application
OR
Have postgresql.conf invoke a shell script which invokes a sql batch. For the second approach, I am not sure how to edit a amazon RDS postgresql.conf file and where to specify the shell script. Where would be the sql batch written? This is a little new to me, appreciate any pointers.
Upvotes: 2
Views: 8952
Reputation: 5606
It will be much faster if you do everything server side instead of using a Springboot application. The problem is not dump/restore which you could easily do with pg_dump utility or psql -d dbname -t -A -F";" -c "SELECT * FROM yourdata WHERE cutdate<=current_timestamp-interval '6 months'" > output.csv
But you have to guarantee that everything that is exported is loaded into the second database and that you do not delete anything that has not been exported.
I would first SELECT a subset of primary keys into a temporary table. Then use server side COPY command to export the preselected keys (and all its dependencies)
COPY (SELECT d.* FROM yourdata d INNER JOIN temporal t WHERE d.pk=t.pk) To '/tmp/yourdata.csv' WITH CSV DELIMITER ',';
After all the export files have been generated
DELETE FROM yourdata WHERE pk IN (SELECT pk FROM temporal)
Then on the backup database do
COPY yourdata(column1,column2,column3) FROM '/tmp/yourdata.csv' DELIMITER ',' CSV
You can write a script that invokes all that commands on server side using psql command line tool and last move the imported files into a permanent location just in case something went wrong and you need to process them again.
See Save PL/pgSQL output from PostgreSQL to a CSV file and How to import CSV file data into a PostgreSQL table?
Upvotes: 3