Qimeng Zhang
Qimeng Zhang

Reputation: 31

how to do IN with an array in SAS

I am defined four variables here and each of the variables with different number of ICD10 codes:

%LET DX_27800_CODE = 'E6609', 'E661', 'E668', 'E669';

%LET DX_27801_CODE = 'E6601';

%LET DX_2859_CODE = 'D649';

%LET DX_6202_CODE = 'N8320', 'N8329';

now I want to use create an array that can easy mapping those variables that with my icd 10 table columns so that I could assign flags variables with it. the regular way would be:

data test; set input;

if (dx1 in ( &DX_27800_CODE) or dx2 in (&DX_27800_CODE) or dx3 in (&DX_27800_CODE)) 

  then dx_27800 = 1; else dx_27800 =0;

run;

in the regular way I would need to do this procedure four times to get all four flags variable. So I'm wondering if it could be done by using array.

data test; set input;

  array dx_code10 [4] &DX_27800_CODE &DX_27801_CODE &DX_2859_CODE &DX_6202_CODE;

  ARRAY DX_VARIABLE[4] DX_27800 DX_27801 DX_2859 DX_6202;

  DO I = 1 TO DIM(dx_code10);

  IF (DX1 IN (DX_CODE10[I]) OR DX2 IN (DX_CODE10[I]) OR DX3 IN (DX_CODE10[I]))

  THEN DX_VARIABLE[I] = 1; 

  ELSE DX_VARIABLE[I] = 0; 
END;
END;
RUN;

But seems like it can't be done by this way. Please help me to solve this problem. thanks.

Upvotes: 1

Views: 63

Answers (2)

Joe
Joe

Reputation: 63424

I think a better approach is to use formats. I'd rather have those DX codes in a spreadsheet or a text file or something, and then input that to make the formats, but even with the not-best-practice %LETs, you can still use a format solution.

Approach is to make a format that turns each of those DX code pairs into a value that returns the dx value (the 27800, 27801, etc.); then use that to drive how you assign the followup array.

%LET DX_27800_CODE = 'E6609', 'E661', 'E668', 'E669';
%LET DX_27801_CODE = 'E6601';
%LET DX_2859_CODE = 'D649';
%LET DX_6202_CODE = 'N8320', 'N8329';


proc format;
  value $dxcode
    &dx_27800_code = '27800'
    &dx_27801_code = '27801'
    &dx_2859_code  = '2859'
    &dx_6202_code   = '6202'
    other=' '
    ;
quit;

data input;
  input dx1 $;
  datalines;
E6601
E6609
E6608
E661
E668
D649
D650
N8320
E669
N8329
;;;;
run;


data want;
  set input;
  array dx_codes[4] dx_27800 dx_27801 dx_2859 dx_6202;
  dx_code_val = put(dx1,$dxcode5.);
  do _i = 1 to dim(dx_codes);
    if dx_code_val = scan(vname(dx_codes[_i]),2,'_') then dx_codes[_i]=1;
    else dx_codes[_i]=0;
  end;
run;

Upvotes: 1

Tom
Tom

Reputation: 51566

For your specific example you could use FINDW() function instead of the IN operator. Turn your code lists into delimited strings instead.

%LET DX_27800_CODE = E6609,E661,E668,E669;
%LET DX_27801_CODE = E6601 ;
%LET DX_2859_CODE = D649 ;
%LET DX_6202_CODE = N8320,N8329;

data test;
  set input;
  array dx_code_list (4) $200 _temporary_ ("&dx_27800_code" "&dx_27801_code" "&dx_2859_code" "&dx_6202_code");
  array dx_variable (4) dx_27800 dx_27801 dx_2859 dx_6202;
  array dx dx1-dx3 ;
  do i = 1 to dim(dx_variable);
    dx_variable(i)=0;
    do j=1 to dim(dx) while (dx_variable(i)=0);
      if findw(dx_code_list(i),dx(j),',','it') then dx_variable(i)=1;
    end;
  end;
  drop i j;
run;

So if I make some sample data.

data input ;
   length dx1-dx3 $7 ;
   input dx1 - dx3 ;
cards;
E6609 E661 .
E668 E669 .
E6601 . . 
D649 N8320 N8329
. . .
;

I get this result:

enter image description here

Upvotes: 0

Related Questions