sweta

Reputation: 1

SAS SQL query to solve, adding rows to distinct variable

I need help with the following scenario.

Input table:

pr_id    lob_id  prec1     prec2   prec3
112      1a      3478      56      77
112      1b      3466      65      43
112      1c      5677      57      68
112      1d      5634      49      52
215      2a      1234      43      45
215      2b      9787      32      43
215      2c      4566      39      90
388      3a      8797      88      99
388      3b      6579      58      72
388      3c      9087      76      67

Required output: one row per distinct pr_id, with each of that pr_id's lob_id rows (lob_id, prec1, prec2, prec3) appended as additional columns on the same row, as shown below.

pr_id  lob_id  prec1   prec2   prec3   lob_id  prec1   prec2  prec3  lob_id  prec1   prec2  prec3 lob_id  prec1  prec2 prec3
112    1a      3478    56      77      1b      3466    65     43     1c      5677    57     68    1d      5634   49    52
215    2a      1234    43      45      2b      9787    32     43     2c      4566    39     90    .       .      .     .
388    3a      8797    88      99      3b      6579    58     72     3c      9087    76     67    .       .      .     .

I have tried doing this with PROC TRANSPOSE, but the resulting variable names differ from the required output. Could you please help me with this?

Thank you.

Upvotes: 0

Views: 292

Answers (1)

Joe

Reputation: 63434

This gets as close as you can to your desired output. It's far more convoluted than is probably needed, but it does ensure each lob_id stays with its own prec1-prec3 values. You cannot give the same name to more than one variable in a dataset, but you can give several variables the same label, so I keep the label the same while adding a suffix (_1, _2, _3, etc.) to the name.

You could then PROC PRINT the dataset with the LABEL option, if you want this in the output window. That prints the labels as column headers instead of the variable names, which gives you the repeated headers from your desired output.
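
As a minimal sketch of that printing step, assuming the WANT dataset built by the code below already exists: PROC PRINT uses variable names as column headers unless the LABEL option is given, so the option is what brings back the repeated lob_id / prec1 / prec2 / prec3 headers. NOOBS is only there to suppress the Obs column and is optional.

```sas
/* Print WANT using the variable labels (lob_id, prec1, ...) as column headers */
/* instead of the suffixed variable names (lob_id_1, prec1_1, ...).            */
proc print data=want label noobs;
run;
```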

data have;
input pr_id    lob_id  $ prec1     prec2   prec3;
datalines;
112      1a      3478      56      77
112      1b      3466      65      43
112      1c      5677      57      68
112      1d      5634      49      52
215      2a      1234      43      45
215      2b      9787      32      43
215      2c      4566      39      90
388      3a      8797      88      99
388      3b      6579      58      72
388      3c      9087      76      67
;;;;
run;
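data have_pret;
  set have;
  by pr_id;                       /* HAVE must already be sorted by pr_id */
  array precs prec:;              /* prec1-prec3 */
  length idlabel $256;            /* room for any variable label, not just "lob_id" */
  if first.pr_id then counter=0;
  counter+1;                      /* position of this lob_id within its pr_id */
  varnamecounter+1;               /* running number, used later to order the columns */

  /* one row carrying the character value (lob_id) */
  valuet=lob_id;
  idname=cats("lob_id",'_',counter);
  idlabel="lob_id";
  output;
  call missing(valuet);

  /* one row per numeric value (prec1-prec3) */
  do __t = 1 to dim(precs);
    varnamecounter+1;
    valuen=precs[__t];
    idname=cats('prec',__t,'_',counter);
    idlabel=vlabel(precs[__t]);   /* falls back to the variable name when no label is set */
    output;
  end;
  call missing(valuen);
  keep pr_id valuet valuen idname idlabel varnamecounter;
run;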

proc sort data=have_pret out=varcounter(keep=idname varnamecounter);
by idname varnamecounter;
run;

data varcounter_fin;
set varcounter;
by idname varnamecounter;
if first.idname;
run;

proc sql;
select idname into :varlist separated by ' ' 
 from varcounter_fin order by varnamecounter;
quit;


proc transpose data=have_pret(where=(not missing(valuen))) out=want_n;
by pr_id;
var valuen;
id idname;
idlabel idlabel;
run;

proc transpose data=have_pret(where=(missing(valuen))) out=want_t;
by pr_id;
var valuet;
id idname;
idlabel idlabel;
run;

data want;
retain pr_id &varlist.;
merge want_n want_t;
by pr_id;
drop _name_;
run;

To do this in SQL is irritating; SAS doesn't support the advanced SQL table functions that would permit you to transpose it neatly without hardcoding everything. It would be something like

proc sql;
select pr_id, 
max(lob_id1) as lob_id1, max(prec1_1) as prec1_1, max(prec2_1) as prec2_1, max(prec3_1) as prec3_1,
max(lob_id2) as lob_id2, max(prec1_2) as prec1_2, max(prec2_2) as prec2_2, max(prec3_2) as prec3_2 from (
select pr_id, 
case when substr(lob_id,2,1)='a' then lob_id else ' ' end as lob_id1, 
case when substr(lob_id,2,1)='a' then prec1 else . end as prec1_1, 
case when substr(lob_id,2,1)='a' then prec2  else . end as prec2_1, 
case when substr(lob_id,2,1)='a' then prec3 else . end as prec3_1, 
case when substr(lob_id,2,1)='b' then lob_id else ' ' end as lob_id2, 
case when substr(lob_id,2,1)='b' then prec1 else . end as prec1_2, 
case when substr(lob_id,2,1)='b' then prec2  else . end as prec2_2, 
case when substr(lob_id,2,1)='b' then prec3 else . end as prec3_2
from have )
group by pr_id;
quit;

but extended to include 3 and 4. You can see why it's silly to do this in SQL I hope :) The SAS code is probably actually shorter, and is doing far more work to make this easily extendable - you could skip half of it if you just hardcoded that retain statement, for example.
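
To illustrate the hardcoded shortcut mentioned above, here is a sketch that assumes no pr_id has more than four lob_id rows (true for the sample data, but an assumption in general). Under that assumption the PROC SORT, the varcounter_fin step and the PROC SQL that build &varlist can all be dropped, and the final step lists the columns explicitly. The names follow the idname pattern produced in have_pret (lob_id_N, prec1_N, prec2_N, prec3_N).

```sas
/* Final merge with the column order spelled out by hand.               */
/* Assumes want_n and want_t were created by the two PROC TRANSPOSE     */
/* steps above, and that at most four lob_id rows exist per pr_id.      */
data want;
  retain pr_id
         lob_id_1 prec1_1 prec2_1 prec3_1
         lob_id_2 prec1_2 prec2_2 prec3_2
         lob_id_3 prec1_3 prec2_3 prec3_3
         lob_id_4 prec1_4 prec2_4 prec3_4;
  merge want_n want_t;
  by pr_id;
  drop _name_;
run;
```

The trade-off is that the list has to be edited by hand whenever a pr_id gains a fifth lob_id or another prec variable is added, which is exactly what the macro-variable version avoids.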

Upvotes: 1
