Upsert (on conflict) vs Read before Write performance

Question

Everytime a logged in user visits one of my webpages, I add a row to a table of (user_id, page_id) to mark that this page was visited where (user_id, page_id) is the primary key in this table. Of course, if the user comes to this page again then the row already exists so a new one doesn't need to be added.

I currently use a postgres ON CONFLICT clause to first try writing and if there's a conflict then I do nothing since the row already exists. But I'm worried that because this write action is happening on every page visit that it's adding unnecessary load to the DB.

Is this thinking correct? If so, then instead of doing the upsert ON CONFLICT DO NOTHING, should I instead do a READ to check if this (user_id, page_id) already exists in the table, and if not, THEN only do the insert?

The benefit of the upsert ON CONFLICT is that it's just one single DB query. The second method of read before write would be 2 DB calls which would be slower since it has to go over the network. But the drawback of the first method is that it's doing a write on every page visit (or is it not actually considered a write since 99% of the time it will result in a conflict?).

Which way should I be following?

jjanes · Accepted Answer

You don't need a separate network round trip, you could package it up into one statement. However, in my hands the ON CONFLICT is slightly faster than the "pre-read" with a two-column key, and generates no extra IO. Tested using pgbench with the custom transactions (once the table is fully populated with all 1e6 rows):

\set a random(1, 1000)
\set b random(1, 1000)
insert into foo (a,b) values (:a,:b) on conflict do nothing;

versus

\set a random(1, 1000)
\set b random(1, 1000)
insert into foo (a,b) select :a,:b where not exists (select 1 from foo where a=:a and b=:b)

Plus the pre-read might be subject to race conditions.

Upsert (on conflict) vs Read before Write performance

Answers (1)

Related Questions