Reputation: 1755
I have two tables with the same columns, the first column is the name and the second is a count. I would like to merge these tables, so that each name appears with the added count of the two tables:
Table1: Table2: Result Table:
NAME COUNT NAME COUNT NAME COUNT
name1 1 name3 3 name1 1
name2 2 name4 4 name2 2
name3 3 name5 5 name3 6
name4 4 name6 6 name4 8
name5 5
name6 6
As of the moment I have created a pretty ugly structure to execute this, and would like to know if it is possible to get the results in a more elegant way.
What I have so far (Table1 is test1 and Table2 is test2):
create table test1 ( name varchar(40), count integer);
create table test2 ( name varchar(40), count integer);
create table test3 ( name varchar(40), count integer);
create table test4 ( name varchar(40), count integer);
create table test5 ( name varchar(40), count integer);
insert into test4 (name, count) select * from test1;
insert into test4 (name, count) select * from test2;
insert into test3 (name , count) select t1.name, t1.count + t2.count
from test1 t1 inner join test2 t2 on t1.name = t2.name;
select merge_db(name, count) from test3;
insert into test5 (name, count) (select name, max(count) from test4 group by name);
CREATE FUNCTION merge_db(key varchar(40), data integer) RETURNS VOID AS
$$ -- souce: http://stackoverflow.com/questions/1109061/insert-on-duplicate-update-postgresql
BEGIN
LOOP
-- first try to update the key
UPDATE test4 SET count = data WHERE name = key;
IF found THEN
RETURN;
END IF;-- not there, so try to insert the key -- if someone else inserts the same key concurrently, -- we could get a unique-key failure
BEGIN
INSERT INTO test4(name,count) VALUES (key, data);
RETURN;
EXCEPTION WHEN unique_violation THEN-- do nothing, and loop to try the UPDATE again
END;
END LOOP;
END;
$$
LANGUAGE plpgsql;
Upvotes: 12
Views: 23942
Reputation: 8098
=> create table t1 (name text,cnt int);
=> create table t2 (name text,cnt int);
=> insert into t1 values ('name1',1), ('name2',2), ('name3',3), ('name4',4);
=> insert into t2 values ('name3',3), ('name4',4), ('name5',5), ('name6',6);
=> select name,sum(cnt) from
(select * from t1
union all
select * from t2 ) X
group by name
order by 1;
name | sum
-------+-----
name1 | 1
name2 | 2
name3 | 6
name4 | 8
name5 | 5
name6 | 6
(6 rows)
Upvotes: 17
Reputation: 521
An alternative method is to use the NATURAL FULL OUTER JOIN combined with SUM(count) and GROUP BY name statements. The following SQL code exactly yields the desired result:
SELECT name, SUM(count) AS count FROM
( SELECT 1 AS tableid, * FROM t1 ) AS table1
NATURAL FULL OUTER JOIN
( SELECT 2 AS tableid, * FROM t2 ) AS table2
GROUP BY name ORDER BY name
The artificial tableid column ensures that the NATURAL FULL OUTER JOIN creates a separate row for each row in t1 and for each row in t2. In other words, the rows "name3, 3" and "name4, 4" appear twice in the intermediate result. In order to merge these duplicate rows and to sum the counts we can group the rows by the name column and sum the count column.
Upvotes: 0
Reputation: 16722
How about this, in pure SQL:
SELECT
COALESCE(t1.name, t2.name),
COALESCE(t1.count, 0) + COALESCE(t2.count, 0) AS count
FROM t1 FULL OUTER JOIN t2 ON t1.name=t2.name;
Basically we're doing a full outer join on the name field to merge the two tables. The tricky part is that with the full outer join, rows that exist in one table but not the other will appear, but will have NULL in the other table; so if t1 has "name1" but t2 doesn't, the join will give us NULLs for t2.name and t2.name.
The COALESCE function returns the first non-NULL argument, so we use it to "convert" the NULL counts to 0 and to pick the name from the correct table. Thanks for the tip on this Wayne!
Good luck!
Upvotes: 9