Reputation: 577
I'm working with PostgreSQL 9.1. Let's say I have a table where some columns have a UNIQUE
constraint. The simplest example:
CREATE TABLE test (
    value INTEGER NOT NULL UNIQUE
);
Now, when inserting some values, I have to separately handle the case where the values to be inserted are already in the table. I have two options:
1. SELECT beforehand to ensure the values are not in the table, or
2. INSERT and watch for any errors the server might return.
The application using the PostgreSQL database is written in Ruby. Here's how I would code the second option:
require 'pg'
db = PG.connect(...)
begin
  db.exec('INSERT INTO test VALUES (66)')
rescue PG::UniqueViolation
  # ... the values are already in the table
else
  # ... the values were brand new
end
db.close
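For comparison, the first option would be a lookup before the INSERT, something along these lines (just a sketch, using a literal value for illustration):
-- first option: check for the value beforehand ...
SELECT 1 FROM test WHERE value = 66;
-- ... and only run the INSERT if the SELECT returned no rows
INSERT INTO test VALUES (66);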
Here's my thinking: let's suppose we make a SELECT first, before inserting. The SQL engine would have to scan the rows and return any matching tuples. If there are none, we make an INSERT, which presumably makes yet another scan to see whether the UNIQUE constraint is about to be violated. So, in theory, the second option would speed the execution up by 50%. Is this how PostgreSQL would actually behave?
We're assuming there's no ambiguity when it comes to the exception itself (e.g. we only have one UNIQUE constraint).
Is this a common practice? Are there any caveats to it? Are there any other alternatives?
Upvotes: 1
Views: 119
Reputation: 44250
You don't (and shouldn't) have to test beforehand; you can test while inserting. Just add the test as a WHERE clause. The following INSERT inserts either zero or one row, depending on whether a row with the same value already exists (and it is certainly not slower):
INSERT INTO test (value)
SELECT 55
WHERE NOT EXISTS (
    SELECT * FROM test
    WHERE value = 55
);
Though your error-driven approach may look elegant from the client side, from the database side it is a near-disaster: the current transaction is implicitly rolled back and all cursors (including prepared statements) are closed. (Thus your application will have to rebuild the complete transaction, this time without the error, and start again.)
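If you do keep the error-driven approach inside a larger transaction, a savepoint can at least limit the rollback to the failing statement. A rough sketch (the savepoint name is arbitrary):
BEGIN;
-- ... other work in the same transaction ...
SAVEPOINT before_insert;
INSERT INTO test VALUES (66);               -- may raise a unique_violation
-- on error: ROLLBACK TO SAVEPOINT before_insert;  -- undoes only the failed INSERT
COMMIT;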
Addition: when adding more than one row, you can put the VALUES() list into a CTE and refer to the CTE in the INSERT query:
WITH vvv(val) AS (
    VALUES (11), (22), (33), (44), (55), (66)
)
INSERT INTO test (value)
SELECT val FROM vvv
WHERE NOT EXISTS (
    SELECT *
    FROM test nx
    WHERE nx.value = vvv.val
);
-- SELECT * FROM test;
Upvotes: 1
Reputation: 23910
It depends. If your application UI normally allows entering duplicate values, then checking before inserting is strongly encouraged, because any error would invalidate the current transaction, consume sequence/serial values, fill the logs with error messages, etc.
But if your UI does not allow duplicates, and inserting a duplicate is only possible when somebody is using tricks (for example during vulnerability research), or is highly improbable, then I'd allow inserting without checking first.
As a unique constraint forces the creation of an index, this check is not slow, though it is definitely slightly slower than inserting and checking for rare errors. Postgres 9.5 will have ON CONFLICT DO NOTHING support, which would be both fast and safe; you'd check the number of rows inserted to detect duplicates.
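On 9.5 that would look roughly like this (using the same test table as above):
INSERT INTO test (value) VALUES (66)
ON CONFLICT DO NOTHING;
-- a reported row count of 0 means the value was already present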
Upvotes: 2