Stef_MOE
Stef_MOE

Reputation: 15

SAS Variables in Array

Using a data step, I want to include a variable as array parameter.

Why? I want to declare an array, then fill it up with data (the needed length for the array is unknown/can change in the future). My current dataset looks like:

Row1: "val1=x val2=y val3=xx val4=yy" etc.
Row2: "val1=x"
Row3: "val1=x val2=y"

Now, I want to create columns, using an array for the valX read-outs. So a data step is used to count the number of occurrences of "val", which gave me a count column with the number of counts.

(1) Now I want to build an expression like: "array{count}", or "array{max(of count)}"; how can this be achieved? Since array needs an integer input?

(2) Another option would be to create an array{100} and then simply drop columns with only missing values, how can this be achieved?

Sample of the data step:

data count;
set input;
counter = count(column,'val','i');
run;
data output;
set count;
array Values{100};
do i = 1 to counter;
Values(i) = scan(column,i+1);
end;
run;

Upvotes: 0

Views: 1356

Answers (2)

Sashole
Sashole

Reputation: 1

Quentin has all the right ideas. However, there's some room for improvement:

  1. There's no need for the extra SQL step (and hence extra pass through the data) since the needed macro variables can be populated in the first DATA step.
  2. If any value exceeds 8 in length, the array element length specification $ will truncate it because it defaults to 8. The remedy is to compute the max array element length in the same step as the array dimension.

For example:

data count (keep = column) ;                             
  infile cards eof = eof ;                               
  input column $50. ;                                    
  retain dim len ;                                       
  do _n_ = 1 to countW (column) ;                        
    len = len max length (scan (column, 2 * _n_, "= ")) ;
  end;                                                   
  dim = dim max (_n_ - 1) ;                              
  return ;                                               
  eof: call symputx ("dim", dim) ;                       
       call symputx ("len", len) ;                       
       delete ;                                          
  cards;                                                 
val1=x val2=y val3=xx val4=yy12345678                    
val1=x                                                   
val1=x val2=y                                            
;                                                        
run ;                                                    

data output ;                                            
  set count ;                                            
  array values [&dim] $ &len ;                           
  do _n_ = 1 to countW (column) ;                        
    values[_n_] = scan (column, 2 * _n_, "= ") ;         
  end;                                                   
run ;

Best regards,

Paul Dorfman

Upvotes: 0

Quentin
Quentin

Reputation: 6378

After you create COUNTER, you can find the maximum value of counter and read that into a macro variable, then use the macro variable to define the size of your array.

data count;
  input column $50.;
  counter = count(column,'val','i');
  cards;
val1=x val2=y val3=xx val4=yy
val1=x
val1=x val2=y
;
run;

proc sql ;
  select max(counter) into :maxcount
  from count
  ;
quit ;

%put &=maxcount ;

data output;
  set count;
  array Values{&maxcount} $;
  do i = 1 to counter;
    Values(i) = scan(column,2*i,'= ');
  end;
  drop i ;
run;

SAS needs to know the size of the array when the data step compiles, which is why you can't use a data step variable to define the size of an array.

Upvotes: 2

Related Questions