Insert data into table efficiently, PostgreSQL

Question

I am new to PostgreSQL (and databases in general) and was hoping to get some pointers on improving the efficiency of the following statement.

I am inserting data from one table into another and do not want to insert duplicate values. Each table has a rid column (a unique identifier) that is indexed and is the primary key.

I am currently using the following statement:

INSERT INTO table1 SELECT * FROM table2 WHERE rid NOT IN (SELECT rid FROM table1);

Right now table1 has 200,000 records and table2 has 20,000 records. table1 is going to keep growing (probably to around 2,000,000), while table2 will stay around 20,000 records. The statement currently takes about 15 minutes to run, and I am concerned that as table1 grows it is going to take way too long. Any suggestions?

Sam Choukri · Accepted Answer

This should be more efficient than your current query:

INSERT INTO table1
SELECT * 
FROM table2
WHERE NOT EXISTS (
  SELECT 1 FROM table1 WHERE table1.rid = table2.rid
);
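
A likely reason for the difference is that PostgreSQL generally cannot plan NOT IN (subquery) as an anti-join because of how NOT IN treats NULLs, whereas NOT EXISTS can be planned that way. As a further option that is not part of the query above, the duplicate check can be left to the primary key itself. The sketch below assumes PostgreSQL 9.5 or later (where ON CONFLICT is available) and that rid is the primary key of table1, as the question states:

```sql
-- Assumes PostgreSQL 9.5+ and a primary key (or unique index) on table1.rid.
-- Rows from table2 whose rid already exists in table1 are skipped
-- instead of raising a unique-violation error.
INSERT INTO table1
SELECT *
FROM table2
ON CONFLICT (rid) DO NOTHING;
```

To compare the variants on real data, each statement can be run under EXPLAIN ANALYZE. Keep in mind that EXPLAIN ANALYZE actually executes the INSERT, so it is safer to wrap the test in BEGIN and ROLLBACK.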
