hkk
hkk

Reputation: 13

fill in with same value each group in sas

I am trying to fill in ID1 variable with same ID number when rdq=adq for each permco in SAS. Here is an example of my data.

    permco   rdq    adq       ID       ID1   
      1      333    331        1        .
      1      333    332        2        .   
      1      333    333        3        3  
      1      333    334        4        .
      1      333    335        5        .
      1      333    336        6        . 
      1      555    552        1        . 
      1      555    553        2        . 
      1      555    554        3        . 
      1      555    555        4        4 
      1      555    556        5        . 
      1      555    557        6        . 
      1      555    558        7        .
      2      333    331        1        .
      2      333    332        2        .   
      2      333    333        3        3  
      2      333    334        4        .
      2      333    335        5        .
      2      333    336        6        . 
      2      555    552        1        . 
      2      555    553        2        . 
      2      555    554        3        . 
      2      555    555        4        4 
      2      555    556        5        . 
      2      555    557        6        . 
      2      555    558        7        .

And what I desire to have is...

    permco   rdq    adq       ID       ID1   
      1      333    331        1        3
      1      333    332        2        3   
      1      333    333        3        3  
      1      333    334        4        3
      1      333    335        5        3
      1      333    336        6        3 
      1      555    552        1        4 
      1      555    553        2        4 
      1      555    554        3        4 
      1      555    555        4        4 
      1      555    556        5        4 
      1      555    557        6        4 
      1      555    558        7        4
      2      333    331        1        3
      2      333    332        2        3   
      2      333    333        3        3  
      2      333    334        4        3
      2      333    335        5        3
      2      333    336        6        3 
      2      555    552        1        4 
      2      555    553        2        4 
      2      555    554        3        4 
      2      555    555        4        4 
      2      555    556        5        4 
      2      555    557        6        4 
      2      555    558        7        4

I would like to fill in ID1 with ID number when rdq=adq.

Upvotes: 1

Views: 1632

Answers (4)

user667489
user667489

Reputation: 9569

Double DoW loop solution:

data have01;
infile cards truncover expandtabs;
input permco   rdq    adq       ID       ID1  ;
cards;
      1      333    331        1        .
      1      333    332        2        .   
      1      333    333        3        3  
      1      333    334        4        .
      1      333    333        5        5
      1      333    336        6        . 
      1      555    552        1        . 
      1      555    553        2        . 
      1      555    554        3        . 
      1      555    555        4        4 
      1      555    556        5        . 
      1      555    557        6        . 
      1      555    558        7        .
      2      333    331        1        .
      2      333    332        2        .   
      2      333    333        3        3  
      2      333    334        4        .
      2      333    335        5        .
      2      333    336        6        . 
      2      555    552        1        . 
      2      555    553        2        . 
      2      555    554        3        . 
      2      555    555        4        . 
      2      555    556        5        . 
      2      555    557        6        . 
      2      555    558        7        .
;
run;

data want;
do _n_ = 1 by 1 until (last.rdq);
    set have01;
    by permco rdq;
    if first.rdq then call missing(ID1);
    if adq = rdq then t_ID1 = ID1;
    drop t_ID1;
end;
do _n_ = 1 to _n_;
    set have01;
    ID1 = t_ID1;
    output;
end;
run;

This assumes that if there are multiple matches, the last one should take precedence. If there are no matches then every row for that group gets a missing value.

Upvotes: 1

Johannes Bleher
Johannes Bleher

Reputation: 331

I assume you want to have the same number within the by-group defined by permco and rdq. The possibility to have two or more matches within one by group is handled with a variable matchespergroup. If no matches are found in one group id1 is a missing value.

data have01;
infile cards truncover expandtabs;
input permco   rdq    adq       ID       ID1  ;
cards;
      1      333    331        1        .
      1      333    332        2        .   
      1      333    333        3        3  
      1      333    334        4        .
      1      333    333        5        5
      1      333    336        6        . 
      1      555    552        1        . 
      1      555    553        2        . 
      1      555    554        3        . 
      1      555    555        4        4 
      1      555    556        5        . 
      1      555    557        6        . 
      1      555    558        7        .
      2      333    331        1        .
      2      333    332        2        .   
      2      333    333        3        3  
      2      333    334        4        .
      2      333    335        5        .
      2      333    336        6        . 
      2      555    552        1        . 
      2      555    553        2        . 
      2      555    554        3        . 
      2      555    555        4        4 
      2      555    556        5        . 
      2      555    557        6        . 
      2      555    558        7        .
run;


data want(drop=rv);
if 0 then set have01;
if _N_=1 then 
    do;
        declare hash hh(dataset:"have01(where=(adq=rdq))",ordered:'A',multidata:'Y');
        hh.definekey('permco','rdq');
        hh.definedata('id1');
        hh.definedone();
    end;
do until(theend);
    set have01 end=theend;
    rv = hh.find();
    hh.has_next(result: matchespergroup);
    if rv=0 then do; matchespergroup+1; output;end;
    else do; id1 = .;output;end;
end;
run;

Upvotes: 0

Sean
Sean

Reputation: 1120

I think loops are unnecessary here. All you need to do is find the value you want in each group and merge it back onto the original dataset:

proc sql;
    create table temp as select distinct
        permco, rdq, id
        from have (where = (rdq = adq));
quit;

proc sql;
    create table want as select distinct
        a.*, b.id as id_filled
        from have as a
        left join temp as b
        on a.permco = b.permco and a.rdq = b.rdq;
quit;

Upvotes: 0

Hitesh Kumar
Hitesh Kumar

Reputation: 1

If your data is like for a combination of Permco and RDQ you have unique value of ID 1 and you need to fill same value through out for for that combination :

An Alternative can be :

  • Create a separate data set having only 3 columns : Permco ,RDQ and ID1 ,remove the rows having blank ID1 .
  • Merge your input data with this data on Permco and RDQ .

    Data intermediate ; set input_data (keep Permco RDQ ID1); If ID1=. then delete; Run;

    proc sort data=input_data out=input_data_1(drop ID1);by Permco RDQ;run;

    proc sort data=intermediate;by Permco RDQ;run;

    data final; merge input_data_1(in=a) intermediate(in=b); by Permco RDQ; if a; run;

Upvotes: 0

Related Questions