DJJ
DJJ

Reputation: 2549

transpose efficiently with proc sql

i would like to know if it is possible to transpose efficiently from wide to long using proc sql in sas.

I'm aware that proc transpose is much quicker that the method i suggest below. But one of my objective would be to avoid storing the transposed table.

Let's say for example, that i have table1 as

Id|   A|   B|   C|  D    
_____________________
 1|  100|3500|6900| 10300
 2|  200| 250| 300| 350
 3|  150|  32| 400| 204
 4|  200| 800|1400| 2000

and i want to turn it into

id|col1|  col2|
______________
 1|   A|   100|
 1|   B|  3500|
 1|   C|  6900|
 1|   D| 10300|
 2|   A|   200|
 2|   B|   250|
 2|   C|   300|
 2|   D|   350|
 3|   A|   150|
 3|   B|    32|
 3|   C|   400|
 3|   D|   204|
 4|   A|   200|
 4|   B|   800|
 4|   C|  1400|
 4|   D|  2000|

I could do this;

select id, 'A' as col1, A as col2
from table1
where A ~=""
union select id, 'B' as col1, B as col2
from table1
where B ~=""
etc

but it is highly inefficient.

Any idea? Thanks.

Upvotes: 6

Views: 17358

Answers (2)

Joe
Joe

Reputation: 63434

If you're in SAS, use PROC TRANSPOSE for this option. There is no particularly good way to do this in PROC SQL; while many SQL variants have their own way to pivot data, SAS has PROC TRANSPOSE and expects you to use it.

The SAS datastep also does this very efficiently, perhaps even better than PROC TRANSPOSE. Here's an example, including creating a view as noted in the comments.

data want/view=want;
set have;
array vars a b c d;                  *array of your columns to transpose;
do _t = 1 to dim(vars);              *iterate over the array (dim(vars) gives # of elements);
  if not missing(vars[_t]) then do;  *if the current array element's value is nonmissing;
    col1=vname(vars[_t]);            *then store the variable name from that array element in a var;
    col2=vars[_t];                   *and store the value from that array element in another var;
    output;                          *and finally output that as a new row;
  end;
end;
drop a b c d _t;                     *Drop the old vars (cols) and the dummy variable _t;
run;

Upvotes: 11

Skyler
Skyler

Reputation: 170

I actually did something just like this today. Try doing this,

proc transpose data = ORIGINAL_DATA;
        out = NEW_DATA;
    by id;
    VAR A-D;
run;

I think this should work.

Upvotes: 2

Related Questions