Reputation: 1277
Here is my structure (with values):
user_eval_history table
user_eval_id | user_id | is_good_eval
--------------+---------+--------------
1 | 1 | t
2 | 1 | t
3 | 1 | f
4 | 2 | t
user_metrics table
user_metrics_id | user_id | nb_good_eval | nb_bad_eval
-----------------+---------+--------------+-------------
1 | 1 | 2 | 1
2 | 2 | 1 | 0
For access time (performance) reasons I want to avoid recomputing user evaluation from the history again and again. I would like to store/update the sums of evaluations (for a given user) everytime a new evaluation is given to the user (meaning everytime there is an INSERT in the user_eval_history table I want to update the user_metrics table for the corresponding user_id).
I feel like I can achieve this with a trigger and a stored procedure but I'm not able to find the correct syntax for this.
I think I need to do what follows:
1. Create a trigger on user metrics:
CREATE TRIGGER update_user_metrics_trigger AFTER INSERT
ON user_eval_history
FOR EACH ROW
EXECUTE PROCEDURE update_user_metrics('user_id');
2. Create a stored procedure update_user_metrics that
2.1 Computes the metrics from the user_eval_history table for user_id
SELECT
user_id,
SUM( CASE WHEN is_good_eval='t' THEN 1 ELSE 0) as nb_good_eval,
SUM( CASE WHEN is_good_eval='f' THEN 1 ELSE 0) as nb_bad_eval
FROM user_eval_history
WHERE user_id = 'user_id' -- don't know the syntax here
2.2.1 Creates the entry into user_metrics if not already existing
INSERT INTO user_metrics
(user_id, nb_good_eval, nb_bad_eval) VALUES
(user_id, nb_good_eval, nb_bad_eval) -- Syntax?????
2.2.2 Updates the user_metrics entry if already existing
UPDATE user_metrics SET
(user_id, nb_good_eval, nb_bad_eval) = (user_id, nb_good_eval, nb_bad_eval)
I think I'm close to what is needed but don't know how to achieve this. Especially I don't know about the syntax.
Any idea?
Note: Please, no "RTFM" answers, I looked up for hours and didn't find anything but trivial examples.
Upvotes: 3
Views: 4827
Reputation: 656804
First, revisit the assumption that maintaining an always current materialized view is a significant performance gain. You add a lot of overhead and make writes to user_eval_history
a lot more expensive. The approach only makes sense if writes are rare while reads are more common. Else, consider a VIEW
instead, which is more expensive for reads, but always current. With appropriate indexes on user_eval_history
this may be cheaper overall.
Next, consider an actual MATERIALIZED VIEW
(Postgres 9.3+) for user_metrics
instead of keeping it up to date manually, especially if write operations to user_eval_history
are very rare. The tricky part is when to refresh the MV.
Your approach makes sense if you are somewhere in between, user_eval_history
has a non-trivial size and you need user_metrics
to reflect the current state exactly and close to real-time.
Still on board? OK. First you need to define exactly what's allowed / possible and what's not. Can rows in user_eval_history
be deleted? Can the last row of a user in user_eval_history
be deleted? Probably yes, even if you would answer "No". Can rows in user_eval_history
be updated? Can user_id
be changed? Can is_good_eval
be changed? If yes, you need to prepare for each of these cases.
Assuming the trivial case: INSERT
only. No UPDATE
, no DELETE
. There is still the possible race condition you have been discussing with @sn00k4h. You found an answer to that, but that's really for INSERT or SELECT, while you have a classical UPSERT
problem: INSERT or UPDATE:
FOR UPDATE
like you considered in the comments is not the silver bullet here. UPDATE user_metrics ...
locks the row it updates anyway. The problematic case is when two INSERT
s try to create a row for a new user_id
concurrently. You cannot lock key values that are not present in the unique index, yet, in Postgres. FOR UPDATE
can't help. You need to prepare for a possible unique violation and retry as discussed in these linked answers:
Assuming these table definitions:
CREATE TABLE user_eval_history (
user_eval_id serial PRIMARY KEY
, user_id int NOT NULL
, is_good_eval boolean NOT NULL
);
CREATE TABLE user_metrics (
user_metrics_id -- seems useless
, user_id int PRIMARY KEY
, nb_good_eval int NOT NULL DEFAULT 0
, nb_bad_eval int NOT NULL DEFAULT 0
);
First, you need a trigger function before you can create a trigger.
CREATE OR REPLACE FUNCTION trg_user_eval_history_upaft()
RETURNS trigger
LANGUAGE plpgsql AS
$func$
BEGIN
LOOP
IF NEW.is_good_eval THEN
UPDATE user_metrics
SET nb_good_eval = nb_good_eval + 1
WHERE user_id = NEW.user_id;
ELSE
UPDATE user_metrics
SET nb_bad_eval = nb_bad_eval + 1
WHERE user_id = NEW.user_id;
END IF;
EXIT WHEN FOUND;
BEGIN -- enter block with exception handling
IF NEW.is_good_eval THEN
INSERT INTO user_metrics (user_id, nb_good_eval)
VALUES (NEW.user_id, 1);
ELSE
INSERT INTO user_metrics (user_id, nb_bad_eval)
VALUES (NEW.user_id, 1);
END IF;
RETURN NULL; -- returns from function, NULL for AFTER trigger
EXCEPTION WHEN UNIQUE_VIOLATION THEN -- user_metrics.user_id is UNIQUE
RAISE NOTICE 'It actually happened!'; -- hardly ever happens
END;
END LOOP;
RETURN NULL; -- NULL for AFTER trigger
END
$func$;
In particular, you don't pass user_id
as parameter to the trigger function. The special variable NEW
holds values of the triggering row automatically. Details in the manual here.
Trigger:
CREATE TRIGGER upaft_update_user_metrics
AFTER INSERT ON user_eval_history
FOR EACH ROW EXECUTE PROCEDURE trg_user_eval_history_upaft();
Upvotes: 3