Reputation: 117
I'm developing a gateway application and I am using a PostgreSQL to store received data before forwarding them. To avoid saturating the disk, I would like to remove all the tuples older than a certain time (in this moment I am thinking about 30 or 60 days).
To implement that, at the moment, I wrote a trigger that, after each INSERT statement in a certain table, remove all the entries older than this particular amount of time (you can find it below). It seems working but I am a little bit afraid of what can happen with a higher rate of date.
DROP TRIGGER IF EXISTS delete_old_measures
ON my_table_1;
CREATE OR REPLACE FUNCTION dropOldMeasures() RETURNS TRIGGER AS $$
BEGIN
DELETE FROM my_table_1 WHERE sqltime < now()-'30 day'::interval;
RETURN NULL;
END;
$$
LANGUAGE 'plpgsql';
CREATE TRIGGER delete_old_measures
AFTER INSERT
ON my_table_1
FOR EACH ROW EXECUTE PROCEDURE dropOldMeasures();
My question is: is there a smarter way to remove old tuples automatically from a database?
Upvotes: 0
Views: 715
Reputation: 1269873
DELETE
is not a great way to delete lots of records because it incurs a lot of overhead.
For large tables, an approach using table partitioning is often used. Basically, table partitioning means that a single table is stored in multiple different files, based on a partitioning key. In your case, the partitioning key would be based on sqltime
-- depending on your needs it could be one hour or one day or one week or one month or whatever.
The idea then is that you can drop partitions very easily and on a schedule. Dropping a partition is much less intensive on the database than deletes.
Upvotes: 3
Reputation: 44192
Even if you do want to do it with a trigger, there is surely no reason to do it FOR EACH ROW. Once per statement is more than enough.
If you don't know what I am talking about, have a look to the PosgreSQL documentation for triggers or, even better, to the page that explain how to create a trigger.
Better would be learn how to use a scheduler, like cron
on Linux or Task Scheduler
on Windows, to schedule this operation periodically.
Space freed up by DELETE is not available for reuse by INSERTs until a vacuum gets run. So unless you are going to run "manual" VACUUMs on the table (or use partitioning instead), there is probably not much point in deleting in batches smaller than autovacuum_vacuum_scale_factor, which is 0.2 by default.
Upvotes: 2