Reputation: 329
Suppose i have random diagnostic codes, such as 001, v58, ..., 142,.. How can I construct columns from the codes which is 1 for the records?
Input:
id found code
1 1 001
2 0 v58
3 1 v58
4 1 003
5 0 v58
......
......
15000 0 v58
Output:
id code_001 code_v58 code_003 .......
1 1 0 0
2 0 0 0
3 0 1 0
4 1 0 0
5 0 0 0
.........
.........
Upvotes: 3
Views: 221
Reputation: 27498
You will want to TRANSPOSE
the values and name the pivoted columns according to data (value of code
) with an ID
statement.
Example:
In real world data it is often the case that missing diagnoses will be flagged zero, and that has to be done in a subsequent step.
data have;
input id found code $;
datalines;
1 1 001
2 0 v58
2 1 003 /* second diagnosis result for patient 2 */
3 1 v58
4 1 003
5 0 v58
;
proc transpose data=have out=want(drop=_name_) prefix=code_;
by id;
id code; * column name becomes <prefix><code>;
var found;
run;
* missing occurs when an id was not diagnosed with a code;
* if that means the flag should be zero (for logistic modeling perhaps)
* the missings need to be changed to zeroes;
data want;
set want;
array codes code_:;
do _n_ = 1 to dim(codes); /* repurpose automatic variable _n_ for loop index */
if missing(codes(_n_)) then codes(_n_) = 0;
end;
run;
Upvotes: 3