asandri

Reputation: 3

Calculate age-specific person-years in SAS

My dataset is a cohort of study participants (one row per participant) for whom I have date of birth (dob), start date (sdate) and end/stop date (edate); sdate and edate bound each person's period of participation in the study. Participants can be assigned to age groups based on their age at sdate (ageatstart), and the total person-years each person contributes to the study can be calculated (py). What I would like to do now is distribute py across the age groups, since each person is aging, and possibly changing age group, during the study.

For example, if I define my age groups as <30, [30,39], >39, the first participant (person1, who contributes a total of 10 py in the SAS code below) should contribute roughly 5 years to the age group <30 and roughly 5 years to the age group [30,39].

Ideally, I would like to have a set of variables created (e.g. pyinagegroup1, pyinagegroup2, pyinagegroup3) that would capture the time contributed by each person to each age group (in my example for person1: pyinagegroup1=5, pyinagegroup2=5, pyinagegroup3=0).

SAS code example:

data py1;
  input dob :ddmmyy10. sdate :ddmmyy10. edate :ddmmyy10. id ageatstart ageatend py ;
  format dob ddmmyy10. sdate ddmmyy10. edate ddmmyy10.;
datalines;
05/03/1980 01/01/2005 31/12/2014 1 24 34 10.0 
12/08/2006 12/08/2006 31/12/2014 2 0 8 8.39 
19/09/1975 01/01/2005 20/12/2011 3 29 35 6.38
;

run;

Upvotes: 0

Views: 495

Answers (1)

Tom

Reputation: 51611

Why not just expand the dataset to have one record per person per year?

This ignores the actual DOB and instead takes your AGEATSTART variable and increments it by one for each calendar year in the time period for that ID.

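data py_expanded;
  set py1;
  /* one record per calendar year touched by the interval sdate..edate */
  do offset=0 to intck('year',sdate,edate);
    /* approximate age: AGEATSTART plus calendar years elapsed (DOB is not used) */
    age=ageatstart+offset;
    /* clip the calendar year to the participation period */
    sdate1 = max(sdate,intnx('year',sdate,offset,'b'));
    edate1 = min(edate,intnx('year',sdate,offset,'e'));
    days = edate1-sdate1+1;
    /* fraction of that calendar year (365 or 366 days); overwrites the total PY */
    py = days/(intnx('year',sdate,offset,'e')-intnx('year',sdate,offset,'b')+1);
    output;
  end;
  format sdate1 edate1 yymmdd10.;
run;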
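
As a variation on the loop above, if stepping AGE by calendar year is too coarse, the split could be made at actual birthdays instead. This is only a sketch under assumptions: py_expanded_dob, bday and nextbday are hypothetical names, it relies on the 'c' (continuous) method of INTCK and the 'same' alignment of INTNX, and it inherits whatever INTNX does with a 29 February DOB. It writes the same AGE and PY variables, one record per person per year of age.

```sas
data py_expanded_dob;
  set py1;
  /* loop over completed years of age, from age at sdate to age at edate */
  do age = intck('year', dob, sdate, 'c') to intck('year', dob, edate, 'c');
    /* the birthday that starts this year of age, and the next one */
    bday     = intnx('year', dob, age, 'same');
    nextbday = intnx('year', dob, age + 1, 'same');
    /* clip that year of age to the participation period */
    sdate1 = max(sdate, bday);
    edate1 = min(edate, nextbday - 1);
    /* fraction of the year of age spent in the study; overwrites the total PY */
    py = (edate1 - sdate1 + 1) / (nextbday - bday);
    output;
  end;
  format sdate1 edate1 bday nextbday yymmdd10.;
run;
```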

Now you can group AGE into whatever categories you want and just sum the new PY variable (which replaces the original total PY on each expanded record).
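
For instance, the grouping and summing might look like the sketch below. It assumes the py_expanded dataset from the step above and the three age groups from the question; py_sorted and py_by_agegroup are hypothetical dataset names, and pyinagegroup1-pyinagegroup3 are taken from the question's wording.

```sas
/* make sure the expanded records are ordered by person */
proc sort data=py_expanded out=py_sorted;
  by id;
run;

data py_by_agegroup;
  set py_sorted;
  by id;
  retain pyinagegroup1 pyinagegroup2 pyinagegroup3;
  /* reset the accumulators at the first record of each person */
  if first.id then do;
    pyinagegroup1 = 0;
    pyinagegroup2 = 0;
    pyinagegroup3 = 0;
  end;
  /* age groups from the question: <30, [30,39], >39 */
  if age < 30 then pyinagegroup1 + py;
  else if age <= 39 then pyinagegroup2 + py;
  else pyinagegroup3 + py;
  /* write one row per person once all of their records are summed */
  if last.id then output;
  keep id pyinagegroup1 pyinagegroup2 pyinagegroup3;
run;
```

The result has one row per ID, so it can be merged back onto py1 by id if the other variables are needed alongside the totals.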

PS Why is your PY value for ID=3 so low? It is only 11 days short of 7 years.
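
A quick way to check is to recompute the total from the dates. This is a rough sketch: py1_check and py_recalc are hypothetical names, and dividing by 365.25 is an assumed convention for the length of a year, not necessarily the one used to produce the original PY.

```sas
data py1_check;
  set py1;
  /* days in the study, counting both sdate and edate, as years */
  py_recalc = (edate - sdate + 1) / 365.25;
run;

proc print data=py1_check;
  var id sdate edate py py_recalc;
run;
```

For ID=3 the period is 2545 days, so py_recalc comes out just under 7 and not at 6.38.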

Upvotes: 1
