Hao Huang

Reputation: 221

PostgreSQL trigger: check total number of rows in WHEN condition

I have a table that needs a trigger to keep the total number of rows at 100. The trigger should work so that whenever a new row is inserted into the table, the oldest row is deleted. Obviously, I need to check whether the total number of rows has already reached 100 before the trigger starts doing this.

I considered using WHEN (condition) for the trigger, where the condition would be a subquery like (SELECT count(*) FROM mytablename) > 100. The problem is that the WHEN condition currently does not support subqueries.
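Roughly the kind of thing I was trying (delete_oldest_row here is just a placeholder function name), which PostgreSQL rejects because a trigger's WHEN condition cannot contain a subquery:

CREATE TRIGGER trim_oldest
AFTER INSERT ON mytablename
FOR EACH ROW
WHEN ((SELECT count(*) FROM mytablename) > 100)
EXECUTE PROCEDURE delete_oldest_row();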

Unfortunately, I could not find another way to count the rows in a table without writing a query.

Any idea how to handle this? Is there another way to configure the trigger? Or am I supposed to check that threshold outside the trigger?

Upvotes: 3

Views: 3197

Answers (3)

Denis de Bernardy

Reputation: 78423

No need for a WHEN condition, if I understand your question and requirement correctly. You could delete the excess rows directly, provided you have an appropriate sort criterion to select the ones that are too old, e.g.:

create function trim_tbl() returns trigger as $$
begin
  -- keep only the 100 newest rows, assuming id reflects insertion order
  delete from tbl
  where id in (select id from tbl order by id desc offset 100);
  return null;
end;
$$ language plpgsql;

create trigger trim_tbl after insert on tbl
for each row execute procedure trim_tbl();

It could probably be a statement-level trigger, too.
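A sketch of that statement-level variant, reusing the same function:

-- fires once per INSERT statement rather than once per inserted row
create trigger trim_tbl_stmt after insert on tbl
for each statement execute procedure trim_tbl();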

That being said:

  1. As @CraigRinger suggested in the comments after his answer, you'll get better performance with a periodic cron job if your interest is merely to keep your table size small (see the sketch after this list).

  2. Having only 100 rows in there essentially guarantees a filtered seq scan plan any time you access it, because of the low cardinality, so don't be confused by this when running explain analyze.
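For the cron approach, the cleanup is just the same delete run periodically outside any trigger, for instance from cron via psql:

-- run every few minutes from cron (via psql or similar)
delete from tbl
where id in (select id from tbl order by id desc offset 100);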

Upvotes: 0

Daniel Vérité

Reputation: 61516

Such conditions on other rows cannot be tested reliably in a trigger, as Craig's answer explains.

However, I believe a simple approach is possible, with the help of a sequence and an additional column in the table.

Initialization:

  • create a sequence starting at 1 and cycling at 100.
  • add an int or smallint column RN for the row number

Insertion logic (a kind of merge, actually):

  • ask for the sequence's nextval (SN)
  • update the row matching RN=SN, if it exists, setting each column to the new row's value. Think of it as "recycling" the row.
  • if the update didn't affect any row, then insert the data as a new row, with RN taking SN as its value.

RN is expected to always be unique and between 1 and 100. When several transactions insert concurrently, they'll target different RNs, so they won't lock each other. Locking may occur if more than 100 transactions are doing this concurrently, though.

Example:

CREATE SEQUENCE cycle_seq maxvalue 100 cycle;
CREATE TABLE tst(val1 int, val2 int, RN int);

CREATE FUNCTION cycling_insert(_val1 int, _val2 int) returns void AS
$$
declare
  _rn int:=nextval('cycle_seq');  -- row number to target, cycling through 1..100
begin
   -- "recycle" the row holding this row number, if it already exists
   UPDATE tst SET val1=_val1,val2=_val2 WHERE RN=_rn;
   IF NOT FOUND THEN
     -- first pass through the cycle: no such row yet, so insert it
     INSERT INTO tst VALUES(_val1,_val2,_rn);
   END IF;
END $$ language plpgsql;
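Calling it, with arbitrary example values:

SELECT cycling_insert(1, 2);  -- inserts, or overwrites the row whose RN comes up next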

Upvotes: 1

Craig Ringer

Reputation: 324375

You can't do that with a trigger WHEN condition, nor would it make sense to do so if you could. If there are two concurrent inserts, the WHEN condition would be evaluated twice; both inserts would see that there are 99 rows in the table, and both would permit the next row to be inserted without a corresponding delete.

Since PostgreSQL doesn't support SQL assertions (does anything?), your best bet is a trigger that always runs on every insert. This trigger:

  • LOCKs the table IN EXCLUSIVE MODE
  • COUNTs rows
  • DELETEs the oldest row if appropriate

... however, there's a wrinkle. The LOCK, while necessary, is what's called a "lock upgrade". The transaction will already hold a lock on the table, but a weaker one. This is a near-guaranteed way to create deadlocks between concurrent transactions, because two or more transactions each hold the weaker lock, and each wants a stronger lock that is blocked by the others' weaker locks on the table.

The only real ways around that are to use only one transaction at a time with this table, or to have every transaction that uses the table LOCK TABLE ... IN EXCLUSIVE MODE before doing anything with it. Both require the client application to be aware of what is going on.
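A minimal sketch of that lock / count / delete trigger, assuming a table tbl whose id column reflects insertion order (both names are placeholders):

CREATE OR REPLACE FUNCTION enforce_100_rows() RETURNS trigger AS $$
BEGIN
  -- lock upgrade: the transaction already holds a weaker lock from the INSERT
  LOCK TABLE tbl IN EXCLUSIVE MODE;
  -- count, and delete the oldest row(s) while over the limit
  WHILE (SELECT count(*) FROM tbl) > 100 LOOP
    DELETE FROM tbl WHERE id = (SELECT min(id) FROM tbl);
  END LOOP;
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER enforce_100_rows AFTER INSERT ON tbl
FOR EACH ROW EXECUTE PROCEDURE enforce_100_rows();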

In general I don't think aiming for a strictly fixed row-count in SQL tables is a great idea. You haven't explained why you want this, so it's hard for me to make suggestions about alternatives, but one possibility might be to make the client app tolerant of small increases above the row-count limit, and do lazy cleanup:

  • When there's an insert into the table, send a NOTIFY
  • Have a LISTENing program wake up when it sees the NOTIFY, grab a whole-table EXCLUSIVE MODE lock that blocks insert/update/delete but permits select, and then delete excess rows.

The handy thing about this is that NOTIFY is only sent when a transaction commits. So you won't have the same kind of locking issues.
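A rough sketch of the notifying side (the channel name tbl_changed, and the table and column names, are placeholders):

CREATE OR REPLACE FUNCTION notify_tbl_insert() RETURNS trigger AS $$
BEGIN
  NOTIFY tbl_changed;  -- delivered to listeners only when the transaction commits
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER notify_tbl_insert AFTER INSERT ON tbl
FOR EACH STATEMENT EXECUTE PROCEDURE notify_tbl_insert();

The listening program would LISTEN tbl_changed and, when woken up, run something like:

BEGIN;
LOCK TABLE tbl IN EXCLUSIVE MODE;
DELETE FROM tbl WHERE id IN (SELECT id FROM tbl ORDER BY id DESC OFFSET 100);
COMMIT;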

Upvotes: 3
