user3493641
user3493641

Reputation: 11

SAS: Do-loop and Array

I have a data set of person records, organized into households.

The goal is to create a new variable, called NCH, if a person is 65 years or older, and 1 means lives with son(s) 2 means lives with daughter(s) 3 means lives with both son(s) and daughter(s) . means does not live with child

I think an array with do loop should be able to accomplish this, but I have very little experience doing this in SAS. Does anyone know how to do this?

Upvotes: 0

Views: 290

Answers (1)

catquas
catquas

Reputation: 720

I came up with two solutions, one using proc sql and one using the data step. Here is the proc sql solution:

proc sql;
    create table new as 
    select *, 
        case when 'M' in (select b.sex from people as b where a.serial=b.serial 
            and (a.pernum = b.momloc or a.pernum = b.poploc)) 
            then 1 else 0 end as son,
        case when 'F' in (select c.sex from people as c where a.serial=c.serial 
            and (a.pernum = c.momloc or a.pernum = c.poploc))
            then 1 else 0 end as daughter,
        case when calculated son = 1 and calculated daughter = 0 then 1
            when calculated son = 0 and calculated daughter = 1 then 2
            when calculated son = 1 and calculated daughter = 1 then 3
            else .
            end as nch
    from people as a
    where age >= 65;
quit;

The "case" expression is like "if then" in regular SAS. The main query is against the table people, which is given an "alias" of a. The select statements in parentheses are called sub-queries. They query the table people with an alias of b and c. They sub-queries are to check weather there exists a male (or female) where serial in a equals serial in b (or c), and pernum in a equals momloc or poploc in b (or c).

Here is the data step solution:

data test;
    set people;
    where age >= 65;
    son = 0;
    daughter = 0;
    do i = 1 to nobs;
        set people (keep=serial momloc poploc sex age 
            rename=(serial=serial1 momloc=momloc1 poploc=poploc1 sex=sex1 age=age1)) 
            point=i nobs=nobs;
        if sex1 = 'M' and serial=serial1 and (pernum = momloc1 or pernum = poploc1) 
            then son = 1;
        if sex1 = 'F' and serial=serial1 and (pernum = momloc1 or pernum = poploc1) 
            then daughter = 1;
    end;
    if son = 1 and daughter = 0 then nch = 1;
    else if son = 0 and daughter = 1 then nch = 2;
    else if son = 1 and daughter = 1 then nch = 3;
    else nch = .;
    drop serial1 momloc1 poploc1 sex1 age1 son daughter;
run;

Here the data step is cycling through the set "people" one record at a time, and for each record it cycles through every record the same data set "people". "Nobs" is the number of observations in "people", and "point" is the number corresponding to each record.

Upvotes: 0

Related Questions