Reputation: 145
I'm making a dictionary of names with gender asociated so I have a main table lets say:
**name_dict a**
name gender
=======================
jhon male
jane female
anna female
and a source data table, which has "duplicates", I mean, the same name, with different gender like this:
**name_source b**
name gender
=======================
cameron male
cameron female
anna female
travis male
I would like to merge this two tables with this conditions
How would I create my merge so I get this result?
name gender
----------------
jhon male
jane female
anna female
travis male
I really appreciate all your help and suggestions!
Edit------------------------------------------------------------------------ so, here is what i created with my inspiration
merge into name_dictionary x using(
select a.name,a.gender from name_source a, (select name,count(1) from name_source group by name having count(1)>1 order by count(1)) b
where a.name=b.name
) y
on (x.name=y.name)
when not matched then
insert (name,gender)
values (y.name,y.gender)
Then i said, lets test it against our friend Thomas Tschernich's sugestion, for that i used:
insert into name_dictionary
select name,gender
from name_source t1
where
(t1.name, t1.gender) not in (
select name, gender from name_dictionary
)
and
(t1.name, t1.gender) not in (
select t2.name, t2.gender
from name_source t2
join name_source t3 on (t2.name = t3.name and t2.gender != t3.gender)
);
then runned both against each other and got:
QUERY EXEC TIME FINAL ROWS PLAN DATA
merge 2 secs 96,070 MERGE STATEMENT ALL_ROWS Cost: 253 Bytes: 46,752 Cardinality: 974
c-Insert killed (31m) ¿? INSERT STATEMENT ALL_ROWS Cost: 24,656,135 Bytes: 1,051,700 Cardinality: 105,170
This is the info on the tables i used:
Table Initial Rows Observations
name_dictionary 3,097 The ones already inserted
name_source 101,205 The ones i want to filter and add to the name_dictionary
(couldn't format it right, hope its readable) anyways, i hope you can elaborate if its right or i missed something, Thanks a lot!!!
---new findings if i remove the order by in the merge the cost goes up to 298;
Upvotes: 0
Views: 1275
Reputation: 1282
It is probably easier to use two separate inserts than a merge. First, insert all the entries from table a, like this:
insert into name_new select * from name_dict
Then, do a conditional insert for your second table, like this:
insert into name_new
select *
from name_source t1
where
(t1.name, t1.gender) not in (
select name, gender from name_new
)
and
(t1.name, t1.gender) not in (
select t2.name, t2.gender
from name_source t2
join name_source t3 on (t2.name = t3.name and t2.gender != t3.gender)
)
The first where
section sorts out the anna-case, the second one will sort out the duplicates with both genders.
Upvotes: 1