Reputation: 121
I have a PostgreSQL database with 4 tables - Table A, B, C and D. Table A has three columns which are IDs from the other three tables.
Something like this:
TABLE A
--------------------------
| id | B_id | C_id | D_id |
---------------------------
I already have several thousand rows of data in tables B, C, D, and now want to generate an insert script for table A, by selecting data from the other 3 based on some conditions.
I tried as follows:
INSERT INTO A (B_id, C_id, D_id) VALUES ((SELECT id FROM B WHERE CONDITION), (SELECT id FROM C WHERE CONDITION), (SELECT id FROM D WHERE CONDITION)),((SELECT id FROM B WHERE CONDITION), (SELECT id FROM C WHERE CONDITION), (SELECT id FROM D WHERE CONDITION)),((SELECT id FROM B WHERE CONDITION), (SELECT id FROM C WHERE CONDITION), (SELECT id FROM D WHERE CONDITION)), ....;
However, with a big amount of rows this takes ages and then fails.
I am wondering if I'm doing this right and if not, what would be the most efficient way of achieving what I want.
More detailed info and example data
Accounts
---------------------------------------------
| id | firstName | lastName | email (unique)|
---------------------------------------------
1 Account One [email protected]
Groups
----------------------
| id | name (unique) |
----------------------
1 Group One
Titles
----------------------
| id | name (unique) |
----------------------
1 Title One
Table A then contains an id from each of these as mentioned:
A
------------------------------------
| id | accountId | groupId | titleId
------------------------------------
1 1 1 1
For each account (each unique email), I programmatically generate an insert statement
INSERT INTO A (accountId, groupId, titleId)
VALUES (
(SELECT id FROM accounts WHERE email = {EMAIL}),
(SELECT id FROM groups WHERE name = {groupName}),
(SELECT id FROM titles WHERE name = {titleName}))
As mentioned there will be several thousand (aprox 15000) insert statements generated. The condition values are programmatically added from some parsed data in my script.
The select statements will always return exactly one row because where conditions are done on columns with unique constraints. Therefore each account email should have exactly one combination of account + group + title in table A.
Upvotes: 2
Views: 744
Reputation: 15614
Create single temporary table for all {EMAIL}, {groupName}, {titleName}
triplets and fill it. Lets say it would be:
create temp table filler (
email text,
group_name text,
title_name text);
insert into filler values
('email1', 'group1', 'title1'),
('email2', 'group2', 'title2'),
....
('emailn', 'groupn', 'titlen');
Or use copy
.
Then insert values in single statement using it:
insert into A (accountId, groupId, titleId)
select a.id, g.id, t.id
from
filler as f
left join accounts as a on (f.email = a.email)
left join groups as g on (f.group_name = g.name)
left join titles as t on (f.title_name = t.name);
PS: Indexes on accounts.email
, groups.name
and titles.name
could be useful.
Upvotes: 1
Reputation: 1269463
Your values expression is set up to insert exactly one row. Each subquery is a scalar subquery that can return at most one row. The overall query will fail, if any of them return more than one row.
Perhaps you want a Cartesian Product of the values returned by all subqueries. If so, use insert . . . select
:
INSERT INTO A (B_id, C_id, D_id)
SELECT b.id, c.id, d.id
FROM (SELECT id FROM B WHERE CONDITION) b CROSS JOIN
(SELECT id FROM C WHERE CONDITION) c CROSS JOIN
(SELECT id FROM D WHERE CONDITION) d;
Run the SELECT
first to be sure it generates the rows you intend to insert.
Upvotes: 0