Reputation: 698
I am wondering what are the best practices to:
Periodically (monthly) archive a portion of a Postgresql table to a .sql
file
Then have the ability to restore that portion to the original table if necessary
For example, say I have transactions_table
that looks like this:
+----+------------+-------------+
| Id | Date | Transaction |
+----+------------+-------------+
| 1 | 2017-06-01 | 123 |
| 2 | 2017-06-02 | 120 |
| 3 | 2017-07-01 | 45 |
| 4 | 2017-08-01 | 420 |
+----+------------+-------------+
How would I go about archiving data for June of 2017 such that all data from June 2017 is removed from transactions_table
and I produce a June2017.sql
file that contains insert commands and all other information so it can easily be reintroduced to transactions_table
?
Is this possible with pg_dump
or is there a better alternative?
The Postgresql db is v9.6 and hosted by AWS. I plan to store the June2017.sql
and such either within an S3 bucket or long term storage such as AWS Glacier. I can figure that part out. Just need to know how to to archive the data appropriately.
Sorry for broad question. Any and all help is appreciated.
Upvotes: 0
Views: 536
Reputation: 37482
pg_dump
doesn't support a WHERE
out of the box. But here somebody provided a patch, that introduces this functionality. You can try if that works for you.
An alternative would be to write a function, that creates a new table, say archive_transactions_table
and pg_dump
that table.
CREATE FUNCTION create_archive_transactions_table
(_year integer,
_month integer)
RETURNS void
AS
$$
BEGIN
DROP TABLE IF EXISTS archive_transactions_table;
CREATE TABLE archive_transactions_table
AS SELECT *
FROM transactions_table
WHERE date_part('year', date) = _year
AND date_part('month', date) = _month;
END;
$$
LANGUAGE plpgsql;
You could also roll your own INSERT
statements by text concatenation. You could then select these from the transactions_table
for the rows withing the time period. Possibly put it in a function using COPY
to directly export the made up statements to a file. COPY
does support WHERE
(or a query in general) but doesn't create INSERT
statements by it's own.
CREATE FUNCTION archive_transactions_table
(_year integer,
_month integer,
_path text)
RETURNS void
AS
$$
DECLARE
_copy_command text;
BEGIN
_copy_command := 'COPY (SELECT ''INSERT INTO transactions_table (id, date, transaction) VALUES ('' || id || '', '''''' || date || '''''', '' || transaction || '');'' insert_query'
|| ' FROM transactions_table'
|| ' WHERE date_part(''year'', date) = ' || _year
|| ' AND date_part(''month'', date) = ' || _month || ')'
|| 'TO ''' || regexp_replace(_path, '([^/])$', '\1/') || _year || '-' || lpad(_month::text, 2, '0') || '.sql''';
EXECUTE _copy_command;
END;
$$
LANGUAGE plpgsql;
(Note: The path provided to COPY
cannot be an expression (like _year || '-' || _month || '.sql
). That's why the whole command has to be build up dynamically.)
Or any possible mix of the above.
Have a look at pgAgent for creating a scheduled task.
Upvotes: 1