Reputation: 329
I have a dataset with diagnosis records, where a patient can have one or more records even for same code. I am unable to use group by variable 'code' since it shows error similar as The ID value "code_v58" occurs twice in the same BY group.
data have;
input id rand found code $;
datalines;
1 101 1 001
2 102 1 v58
2 103 0 v58 /* second diagnosis record for patient 2 */
3 104 1 v58
4 105 1 003
4 106 1 003 /* second diagnosis record for patient 4 */
5 107 0 v58
;
Desired output:
Obs id code_001 code_v58 code_003
1 1 1 . .
2 2 . 1 . /* second diagnosis code's {v58} status for patient 2 is 1, so it has to be taken*/
3 3 . 1 .
4 4 . . 1
5 5 . 0 .
When I tried with let statement like [this],
proc transpose data=temp out=want(drop=_name_) prefix=code_ let;
by id;
id code; * column name becomes <prefix><code>;
var found;
run;
I got output as below:
Obs id code_001 code_v58 code_003
1 1 1 . .
2 2 . 0 .
3 3 . 1 .
4 4 . . 1
5 5 . 0 .
I tried this and modified PROC TRANSPOSE to use ID and count in the BY statement
proc transpose data=temp out=want(drop=_name_) prefix=code_;
by id count;
id code; * column name becomes <prefix><code>;
var found;
run;
and got output like below:
Obs id count code_001 code_v58 code_003
1 1 1 1 . .
2 2 1 . 1 .
3 2 2 . 0 .
4 3 1 . 1 .
5 4 1 . . 1
6 4 2 . . 1
7 5 1 . 0 .
May I know how to remove duplicate patient ids and update the code to 1 if found in any records?
Upvotes: 1
Views: 494
Reputation: 27508
You can transpose
a group aggregate view
.
proc sql;
create view have_v as
select id, code, max(found) as found
from have
group by id, code
order by id, code
;
proc transpose data=have_v out=want prefix=code_;
by id;
id code;
var found;
run;
Follow up with Proc STDIZE
(thanks @Reeza) if you want to replace the missing values (.
) with 0
proc stdize data=want out=want missing=0 reponly;
var code_:;
run;
Upvotes: 2
Reputation: 63434
Seems to me that you want something like this - first preprocess the data to get the value you want for FOUND, then transpose (if you actually need to). The TABULATE does what it seems like you want to do for FOUND (take the max value of it, 1 if present, 0 if only 0s are present, missing otherwise), and then TRANSPOSE that the same way you were doing before.
proc tabulate data=have out=tab;
class id code;
var found;
tables id,code*found*max;
run;
proc transpose data=tab out=want prefix=code_;
by id;
id code;
var found_max;
run;
Upvotes: 2