SAS: Multiple patient diagnoses on multiple lines

Question

I have a dataset of patient diagnoses with one diagnosis code per line, resulting in patient diagnoses on multiple lines. Each patient has a unique patientID. I also have age, race, gender, etc. data on these patients.

How do I indicate to SAS when using PROC FREQ, Logistic, Univariate, etc. that they are the same patient?

This is an example of what the data looks like:

patientID diagnosis age gender  lab
1         15.02     65    M      positive
1         250.2     65    M      positive
2         348.2     23    M      negative
2         282.1     23    M      negative
3                   50    F      positive

I was given data on every patient who has had a certain lab (regardless of positive result), as well as all of their diagnoses, which each appear on a different line (as a different observation to SAS). First, I will need to exclude every patient who has a negative result for the lab, which I plan on using an IF statement for. The lab determines if the patient has disease X. Some patients do not have any additional diseases, other than disease X, such as patient #3.

Analyses I would like to perform:

Calculate the frequency of each disease using PROC FREQ.
Characterize the age and race relationships for each diagnosis using PROC FREQ chi square.
Use PROC LOGISTIC to determine risk factors (age, race, gender, etc.) for developing an additional disease on top of disease X.

Thanks!

Reeza · Accepted Answer

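The short answer is that you can't, at least not by default: procedures such as PROC FREQ treat every row as a separate observation and have no notion of rows belonging to the same patient. You can account for that easily when you process the data, though. IMO keeping the data long (one diagnosis per row) is easier than reshaping it.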
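For patient-level variables such as age and gender, one way to account for the repeated rows is to collapse the data to one row per patient before running the procedure. The following is a minimal sketch, not a definitive implementation. It assumes the input dataset is called have, that the columns are named as in the sample data in the question, and that lab is a character variable holding exactly 'positive' or 'negative'. The output name patients is hypothetical.

```sas
/* Sketch: keep lab-positive patients only, one row per patient.       */
/* Assumes HAVE has the columns from the question's sample data and    */
/* that LAB is character ('positive' / 'negative').                    */
/* PATIENTS is a hypothetical output dataset name.                     */
proc sort data=have(where=(lab = 'positive'))
          out=patients(keep=patientID age gender lab)
          nodupkey;
  by patientID;   /* NODUPKEY keeps the first row for each patientID */
run;

/* Each patient now contributes exactly one observation. */
proc freq data=patients;
  tables gender;
run;

proc univariate data=patients;
  var age;
run;
```

This relies on age and gender being identical on every row for a given patient, as they are in the sample data. If they can differ, the row that NODUPKEY keeps would need to be chosen deliberately.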

You've asked too many questions above so I'll answer just one: how to count the number of patients with each diagnosis.

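/* Keep one row per patient/diagnosis pair so a repeated code is not counted twice */
proc sort data=have out=unique_disease_patient nodupkey;
  by patientID diagnosis;
run;

/* Each remaining row is a distinct patient, so COUNT is the number of patients per diagnosis */
proc freq data=unique_disease_patient noprint;
  tables diagnosis / out=disease_patient_count;
run;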

Note that this is much easier in SQL:

proc sql;
  create table want as
  select diagnosis, count(distinct patientID) as n_patients
  from have
  group by diagnosis;
quit;

I'm assuming this is homework because you're unlikely to do this in practice except for exploratory analysis.
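The same idea of collapsing to one row per patient could be extended to the logistic regression mentioned in the question. This is only a sketch under stated assumptions: the dataset is called have, the columns are named as in the question's sample data, lab is a character variable, and a non-missing diagnosis on any row means the patient has an additional disease besides disease X. The names patient_level and any_other_dx are hypothetical.

```sas
/* Sketch: build one row per lab-positive patient with a 0/1 outcome.  */
/* Assumptions: dataset HAVE, column names from the question's sample, */
/* and a blank diagnosis means "no disease other than X".              */
/* PATIENT_LEVEL and ANY_OTHER_DX are hypothetical names.              */
proc sql;
  create table patient_level as
  select patientID,
         max(age)    as age,      /* assumed constant within a patient */
         max(gender) as gender,   /* assumed constant within a patient */
         max(case when missing(diagnosis) then 0 else 1 end)
                     as any_other_dx
  from have
  where lab = 'positive'
  group by patientID;
quit;

/* Model the probability of having at least one additional diagnosis. */
proc logistic data=patient_level;
  class gender / param=ref;
  model any_other_dx(event='1') = age gender;
run;
```

Other covariates such as race would be carried through the SELECT and added to the MODEL statement in the same way.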
