Reputation: 35
I have seen some questions related to my issue, but never exactly the same structure, and as I am quite new to SAS/SQL I can't find a proper way out of my problem.
I am trying for a couple of days to join multiple tables (about thirty)
Basically, my problem boils down to this:
Let say I have 3 tables:
TAB, with 2 variables V1 and V2 t1, with 2 variables V2 and V3 t2, with the same 2 variables V2 and V3
I want to join TAB with t1 or t2 depending on the value of V2 to obtain the Full_TABLE
TAB
V1 V2
A 1
B 1
C 2
D 2
t1
V2 V3
1 x
1 y
1 z
t2
V2 V3
2 h
2 i
2 j
intended result:
Full_TABLE
V1 V2 V3
A 1 x
A 1 y
A 1 z
B 1 x
B 1 y
B 1 z
C 2 h
C 2 i
C 2 j
D 2 h
D 2 i
D 2 j
Logically, it is a conditional join based on the value of V2:
if V2=1 then merge TAB with t1
if V2=2 then merge TAB with t2
I don't know if it is possible to do it automatically,
To give a rough idea, V1 has 30.000 different values, V2 has 27 (so 27 tables t1-t27), V3 has 10 values per value of V2, thus I expect a Full_TABLE of 30.000*10=300.000 rows
So I can manage a semi-automated solution based on V2 and/or V3, but not V1
Any leads in SAS or SQL (or proc sql...) highly appreciated !
S
Upvotes: 2
Views: 850
Reputation: 51566
It looks to me like T1 and T2 are the same table that was split into two based on the value of V2. Why not just UNION them and then join?
proc sql ;
create table want as
select a.*,b.v3
from tab a
, (select * from t1 union select * from t2) b
where a.v2 = b.v2
;
quit;
Upvotes: 0
Reputation: 63424
The data step solution is interesting:
data want;
set tab;
if v2=1 then do;
do _n_ = 1 to nobs_t1;
set t1 point=_n_ nobs=nobs_t1;
output;
end;
end;
else if v2=2 then do;
do _n_=1 to nobs_t2;
set t2 point=_n_ nobs=nobs_t2;
output;
end;
end;
run;
That's the standard "cartesian join in the data step", just done twice, with an if telling SAS which to do.
Basically, you use point
to iterate through either t1 or t2 depending on the value of v2. This could be extended through the macro language if needed to allow for many tables/variable values (similarly to how you would extend the sql solution).
Upvotes: 1
Reputation: 1269763
You can do this with proc sql
and some SQL cleverness:
proc sql;
select tab.v1, tab.v2, coalesce(t1.v3, t2.v3) as v3
from tab left join
t1
on tab.v2 = 1 left join
t2
on tab.v2 = 2;
Upvotes: 3