Reputation: 401
I am wondering of I can consolidate some code into one PROC SQL
statement instead of several back-to-back SQL
statements.
I am using macro to calculate the age of a participant at date of consent for a study, similar to what is provided here and here. I tried to troubleshoot the functionality of macros, and I calculated the total weight of a baby at birth in ounces using the %sum macro (which works great...). However, when trying to calculate age, the macros don't work.
However, if I use the macros in a new SQL
statement, it works fine.
The code below works:
%macro months(somedate,birth);
intck('month',&birth,&somedate)
- (day(&somedate) < day(&birth))
%mend months;
%macro days(somedate,birth);
intck('day',&birth,&somedate)
- (day(&somedate) < day(&birth))
%mend days;
%macro sum(part1, part2);
&part1*16 + &part2
%mend sum;
********** bringing in data from outside tables ;
proc sql;
create table demos as
select x.*, infcondt2 as c_dt,
y.*, datepart(visitdt) as v_dt format date9. ,
datepart(birthdt) as b_dt format date9. ,
birthweightlbs as lbs,
birthweightoz as oz,
lbs*16 + oz as tot_oz,
%sum(lbs,oz) as tot_oz_m
from enrolled as x,
demographics as y
where x.center = y.center and x.id = y.id ;
quit;
********** calculating age in months and in days ;
proc sql;
create table demos2 as
select * ,
%months(c_dt, b_dt) as age_m ,
%days(c_dt, b_dt) as age_d
from demos;
quit;
********** creating age groupings by months: 0-3 and 3-6 ;
proc sql;
create table demos3 as
select * ,
case
when age_m le 3 then 1
when age_m le 6 and age_m gt 3 then 2
else 3
end as age_interval
from demos2;
quit;
Is there any way to consolidate it into a single statement? Something like:
proc sql;
create table demos as
select x.*, infcondt2 as c_dt,
y.*, datepart(visitdt) as v_dt format date9. ,
datepart(birthdt) as b_dt format date9. ,
birthweightlbs as lbs,
birthweightoz as oz,
lbs*16 + oz as tot_oz,
%sum(lbs,oz) as tot_oz_m,
%months(c_dt, b_dt) as age_m,
%days(c_dt, b_dt) as age_d,
case
when age_m le 3 then 1
when age_m le 6 and age_m gt 3 then 2
else 3
end as age_interval
from enrolled as x,
demographics as y
where x.center = y.center and x.id = y.id ;
quit;
Upvotes: 2
Views: 1482
Reputation: 63424
If you want to use a field that was previously created in the same SQL statement, you need to use the calculated
keyword. IE:
proc sql;
select age*2 as double_Age, calculated double_age/2 as normal_age from sashelp.class;
quit;
Calculated is only needed when there is an actual calculation - ie, for b_Dt. c_dt is only a rename of infcondt2, so you can use c_dt (or infcondt2) interchangeably, and cannot use CALCUALTED.
data test;
input dt1 :date9. dt2 :date9.;
datalines;
01JAN2010 01FEB2011
01DEC2011 03FEB2012
;;;;
run;
%macro months(somedate,birth);
intck('month',&birth,&somedate) - (day(&somedate) < day(&birth))
%mend months;
proc sql;
create table test2 as
select dt1 -1 as b_dt, dt2 as c_dt, %months(calculated b_dt, c_dt) as third
from test;
quit;
That said, you no longer need to adjust for day-of-month if you have 9.2 or sooner - look at the documnentation for INTCK. There is an optional argument (in 9.3 it's called METHOD, I think 9.2 calls it something different) that allows you to force it to use a continuous month-concept rather than a discrete concept (counting first-of-the-months is discrete, for example, as the default used to be).
Also, I don't understand the point of the DAY macro - not only are DAYs integers (so you can just subtract the two numbers using normal subtraction), but why are you subtracting the day/day like in %month? That's to correct for the part-of-month, as I just discussed, and is not needed for days (it would yield a wrong answer where somedate < birth).
Example of correct code:
%macro months(somedate,birth);
intck('month',&birth,&somedate)
- (day(&somedate) < day(&birth))
%mend months;
%macro days(somedate,birth);
intck('day',&birth,&somedate)
- (day(&somedate) < day(&birth))
%mend days;
%macro sum(part1, part2);
&part1*16 + &part2
%mend sum;
data enrolled;
input
id infcondt2 :date9.
;
datalines;
1 01JAN2011
2 02JAN2011
3 03MAR2011
;;;;
run;
data demographics;
input
id
birthweightlbs
birthweightoz
birthdt :datetime17.
;
datalines;
1 8 14 04MAR2011:15:13:14
2 7 13 05MAR2011:15:13:14
3 6 15 06MAR2011:15:13:14
;;;;
run;
proc sql;
create table demos as
select x.*, infcondt2 as c_dt,
y.*, datepart(birthdt) as b_dt format date9. ,
birthweightlbs as lbs,
birthweightoz as oz,
lbs*16 + oz as tot_oz,
%sum(lbs,oz) as tot_oz_m,
%months(c_dt, calculated b_dt) as age_m,
c_dt - calculated b_dt as age_d, /* %days is almost certainly wrong here */
case
when calculated age_m le 3 then 1
when 3 le calculated age_m le 6 then 2
else 3
end as age_interval
from enrolled as x,
demographics as y
where x.id = y.id ;
quit;
Upvotes: 3
Reputation: 1304
If you want to do complicated data step logic on your proc sql
variables, the best way I know is to use proc fcmp
to define the function instead of a macro.
e.g.
proc fcmp outlib=work.util.Dates;
function months(somedate, birth);
return (intck('month',&birth,&somedate) - (day(&somedate) < day(&birth)));
endsub;
run;
options cmplib=work.util;
proc sql;
create table demos as
select x.*, infcondt2 as c_dt,
y.*, datepart(visitdt) as v_dt format date9. ,
datepart(birthdt) as b_dt format date9. ,
birthweightlbs as lbs,
birthweightoz as oz,
lbs*16 + oz as tot_oz,
months(c_dt, b_dt) as age_m,
case
when age_m le 3 then 1
when age_m le 6 and age_m gt 3 then 2
else 3
end as age_interval
from enrolled as x,
demographics as y
where x.center = y.center and x.id = y.id ;
quit;
You can use the months()
function you created in data
steps, proc sql
, and in some other places too.
Upvotes: 3