Reputation: 23
I've got a remote, ever-growing TimescaleDB database. I would like to keep only the most recent entries in that database, backing up the rest of the data to a local drive, so that the database size on the server stays roughly constant.
I thought of making full pg_dump
backups before running retention on the server, and then rebuilding the database locally from these backups.
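Roughly something like this (the host, user, database name, hypertable name and the 30-day cutoff below are placeholders, and the drop_chunks call uses the TimescaleDB 1.x argument order, where the interval comes first):

    # Take a full custom-format dump of the remote database to the local drive,
    # then apply retention on the server so it stops growing.
    pg_dump -h remote.example.com -U postgres -Fc -f backup_$(date +%F).dump db_name
    psql -h remote.example.com -U postgres -d db_name \
         -c "SELECT drop_chunks(interval '30 days', 'table_name');"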
Also, I could use WAL-E
to create a continuous copy, somehow ignoring the deletions on the remote database.
What would be the most efficient way to achieve that?
Upvotes: 1
Views: 476
Reputation: 23
So I've decided to transfer old data chunk by chunk.
First, I run SELECT show_chunks(older_than => interval '1 day');
to determine which chunks are eligible for archiving.
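From the shell that looks roughly like this (connection details are placeholders; this assumes a pre-2.0 TimescaleDB, where show_chunks can be called without a hypertable name):

    # Write the fully qualified chunk names, one per line, to a file for the next step.
    psql -h remote.example.com -d db_name -At \
         -c "SELECT show_chunks(older_than => interval '1 day');" > old_chunks.txt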
Next, I iterate over the selected chunks with \copy _timescaledb_internal.name_of_the_chunk_n to 'chunk_n.csv' csv.
After that, I use rsync
to move the CSV backups to the local drive.
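The export loop is roughly the following sketch; it is meant to run on the database server itself, so the CSVs are written there first (paths and the remote host are placeholders). I also add the header option so that the --skip-header flag in the import step below has something to skip:

    # On the server: dump every listed chunk to its own CSV file (with a header row).
    mkdir -p /tmp/archive
    while read -r chunk; do
        psql -d db_name -c "\copy ${chunk} to '/tmp/archive/${chunk##*.}.csv' csv header"
    done < old_chunks.txt

    # On the local machine: pull the CSV files down.
    rsync -avz user@remote.example.com:/tmp/archive/ /local/backup/dir/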
Finally, I've set up a shallow copy of the remote database by hand (being aware of the bug), and use timescaledb-parallel-copy --skip-header --db-name db_name --table table_name --file /path/to/chunk_n --workers 2 --reporting-period 10s
to insert the data into the local database.
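The import side is just a loop over the synced files with the flags above (database, table and path names are placeholders; the CSVs are assumed to carry a header row, otherwise drop --skip-header):

    # Load each archived chunk into the local hypertable.
    for f in /local/backup/dir/*.csv; do
        timescaledb-parallel-copy --skip-header --db-name db_name --table table_name \
            --file "$f" --workers 2 --reporting-period 10s
    done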
Upvotes: 1
Reputation: 1902
(TimescaleDB person here)
There are two main approaches here: a full-database backup along the lines you describe, or combining drop_chunks
with your data extraction process. The answer somewhat depends on how complex your data/database is.
If you are primarily looking to archive your data in a single hypertable, I would recommend the latter: use show_chunks
to determine which chunks lie beyond a certain range, run a SELECT over that range and write the data wherever you like, and then execute drop_chunks
over the same range.
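A minimal sketch of that flow, assuming a hypothetical hypertable table_name with a time column named time, a 30-day cutoff, and the TimescaleDB 1.x drop_chunks signature (in 2.x the hypertable name comes first):

    # 1. Inspect which chunks fall entirely beyond the cutoff.
    psql -d db_name -c "SELECT show_chunks('table_name', older_than => interval '30 days');"
    # 2. Extract the same range and write it wherever you like (here: a CSV file).
    psql -d db_name -c "\copy (SELECT * FROM table_name WHERE time < now() - interval '30 days') to 'archive.csv' csv header"
    # 3. Drop the chunks covering that range on the server.
    psql -d db_name -c "SELECT drop_chunks(interval '30 days', 'table_name');"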
Upvotes: 0