stkvtflw
stkvtflw

Reputation: 13507

Postgres: best solution to ignore errors during bulk insert

I've seen bunch of similar questions here. But my question, i guess, specific.

This is best solution i've found for current version of Postgres (9.4): http://www.depesz.com/2012/06/10/why-is-upsert-so-complicated/ http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html#PLPGSQL-UPSERT-EXAMPLE

Generally, the suggested solution is based on algorithm: insert one row, if error happened - do something, if not - insert next one.

If all i need is to ignore duplicate errors, can I just do this?

INSERT INTO tablename (id,name,surname) values (1,'john','doe')
INSERT INTO tablename (id,name,surname) values (2,'jane','smith')

instead of this:

INSERT INTO tablename (id,name,surname) values (1,'john','doe'),(2,'jane','smith')

...if I have to insert only ~5-30 rows at once? so, some of inserted rows just return duplicate errors, but the rest of them will be executed successfully. This is all I need, actually.

I've tried to compare the cost of those approaches by EXPLAIN INSERT 100 rows at once and 100 rows separately. Probably, I'm doing something wrong, because it shows ~25-50 times lesser execution time, when I'm inserting rows separately, like in first example:

INSERT INTO tablename (id,name,surname) values (1,'john','doe')
INSERT INTO tablename (id,name,surname) values (2,'jane','smith')

Here is query wrapper i was using:

BEGIN;
EXPLAIN ANALYZE
-- query or queriES here
ROLLBACK;

So, the question is, why am I receiving this? May be, EXPLAIN shows execution time for each row instead of the whole query? Then it makes sense: in this case, bulk insert will cost ~3 times less than separated commands. Right?

Upvotes: 5

Views: 2198

Answers (1)

Clodoaldo Neto
Clodoaldo Neto

Reputation: 125214

with i (id, name, surname) as (values (1,'john','doe'),(2,'jane','smith'))
insert into t (id, name, surname)
select id, name, surname
from i
where not exists (
   select 1
   from t
   where id = i.id and name = i.name and surname = i.surname
)

There is a race condition above if data can be inserted in any other way. Just retry if an error occurs.

Upvotes: 2

Related Questions