Jont
Jont

Reputation: 37

In SAS, how do get distinct counts of a variable that has multiple observations for each individual?

A simple question, but I have three variables in my dataset: ID, ICDCode, and a Visit Date. There are multiple occurrences of each ICDCode per person(ID). How do I obtain a total, distinct count of icdcodes for the entire dataset, without counting a certain ICDCode twice for an individual? For example, I want to say that there are 100 cases of Heart Disease in my dataset (without counting heart disease 10 times for the same person).
Below is code I have tried:

    proc freq data= cases;
    table ICDCode;
    run;

    proc sql;
    select ICDCode,
    count(*) as Frequency
    from cases
    group by ID;
    quit;
    

Upvotes: 0

Views: 3508

Answers (3)

Jeremy Miller
Jeremy Miller

Reputation: 146

First sort with nodupkey to restrict to one copy of each observed ID/ICDcode combination, then run a simple frequency table.

proc sort data=cases out=want nodupkey;
   by id icdcode;

proc freq data=want;
   tables icdcode;
run;

Upvotes: 1

Richard
Richard

Reputation: 27516

Count the distinct patient ids when grouping by icd code.

Example:

data have;
  call streaminit(123);
  do patid = 1 to 100;
    do dxseq = 1 to 10;
      if rand('uniform') < 0.25 or dxseq = 1 then 
        code = '429.9';  /* hey Oprah, everybody gets a heart disease!; */
      else
        code = put(428 + round(3*rand('uniform'),0.1), 5.1);
      output;
    end;
  end;
run;

proc sql;
  create table codefreq as
    select code, count(distinct patid) as pat_count
    from have
    group by code;

enter image description here

Upvotes: 1

PeterClemmensen
PeterClemmensen

Reputation: 4937

How about simply: (Given that 429.9 represent heart disease)

data cases;
input ID ICDCode;
datalines;
1 429.9
1 429.9
1 2    
1 3    
3 429.9
3 429.9
3 3    
2 1    
2 2    
;

proc sql;
   select count(distinct ID) as n
   from cases
   where ICDCode = 429.9;
run;

Upvotes: 3

Related Questions