Reputation: 121
I have 2 tables in PostgreSQL:-
student_id name score
1 Adam 10
2 Brian 9
log_id student_id score
1 1 10
2 2 9
I have a python script which fetches a DataFrame with columns - "name" and "score" and then populates it to the student table.
I want to update the student and student_log table whenever the "score" changes for a student. Also, if there is a new student name in the dataframe, I want to add another row for it in the student table as well as maintain its record in the "student_log" table. Can anyone suggest how it can be done?
Let us consider the new fetched DataFrame looks like this:-
name score
Adam 7
Lee 5
Then the Expected Result is:-
"student" table
student_id name score
1 Adam 7
2 Brian 9
3 Lee 5
"student_log" table:-
log_id student_id score
1 1 10
2 2 9
3 1 7
4 3 5
Upvotes: 1
Views: 249
Reputation: 121
I finally found a good answer. I used trigger, function and CTE.
I create a function to log changes along with a trigger to handle the updates. Following is the code.
CREATE OR REPLACE FUNCTION log_last_changes()
RETURNS TRIGGER
LANGUAGE PLPGSQL
AS
$$
DECLARE
serial_num integer;
BEGIN
IF NEW.name <> OLD.name OR NEW.score <> OLD.score
THEN
SELECT SETVAL('log_id_seq', (select max(id) from log)) into serial_num;
INSERT INTO log(student_id,score)
VALUES(NEW.id,NEW.score)
ON CONFLICT DO NOTHING;
END IF;
RETURN NEW;
END;
$$;
CREATE TRIGGER log_student
AFTER UPDATE
ON student
FOR EACH ROW
EXECUTE PROCEDURE log_last_changes();
THE CTE expression is as follow:-
WITH new_values(id, name, score) AS (
values
(1,'Adam',7),
(2,'Brian',9),
(3,'Lee',5)
),
upsert AS
(
UPDATE student s
SET NAME = nv.name,
SCORE = nv.score
FROM new_values nv, student s2
WHERE
s.id = nv.id and s.id = s2.id
Returning s.*
)
INSERT INTO student select id, name, score
FROM
new_values
WHERE NOT EXISTS (
SELECT 1 from upsert up where up.id=new_values.id
);
Upvotes: 1
Reputation: 1775
I guess you try to diff two dataframe
here is a example
#old student dataframe
old_pd:pd.DataFrame
#new student dataframe
new_pd:pd.DataFrame
joined_pd = new_pd.join(old_pd,on='name',lsuffix='_new',rsuffix='_old')
diff_pd = joined_pd[joined_pd['score_new']!=joined_pd['score_old']]
#then insert all diff_pd to student_log table.and update to student table
Upvotes: 0