Kay
Kay

Reputation: 365

Compress Newline character for dynamic varaibles

Dataset: Have
F1      F2

Student Section
Name    No

Dataset "Have". Data has new line character.

I need to compress the newline character from the data.

I want to do this dynamically as sometimes the "Have" dataset may contain new variables like F3,F4,F5 etc.,

I have written as macro to do this.. However it is not working as expected.

When i execute the below code, first time I am getting error as invalid reference newcnt. If i execute for second time in the same session, i am not getting error.

PFB my code:

%macro update_2(newcnt);

data HAVE;
    set HAVE;
    %do i= 1 %to &newcnt;
        %let colname = F&i;
        &colname=compress(&colname,,'c');
    %end;
run;

%mend update_2;


%macro update_1();

proc sql noprint;
    select count(*) into :cnt from dictionary.columns where libname="WORK" and memname="HAVE"; 
quit;

%update_2(&cnt)

%mend update_1;

Note: All the variables have name as F1,F2,F3,F4.,

Please tell me what is going wrong..

If there is any other procedures, please help me.

Upvotes: 2

Views: 799

Answers (4)

Robert Penridge
Robert Penridge

Reputation: 8513

We created our own function to clean unwanted characters from strings using proc fcmp. In this case, our function cleans tab characters, line feeds, and carriage returns.

proc fcmp outlib=common.funcs.funcs; /* REPLACE TARGET DESTINATION AS NECESSARY */

  function clean(iField $) $200;    
    length cleaned $200;
    bad_char_list = byte(10) || byte(9) || byte(13);  
    cleaned = translate(iField," ",bad_char_list);
    return (cleaned );
  endsub;

run;

Create some test data with a new line character in the middle of it, then export it and view the results. You can see the string has been split across lines:

data x;
  length employer $200;
  employer = cats("blah",byte(10),"diblah"); 
run;

proc export data=x outfile="%sysfunc(pathname(work))\x.csv" dbms=csv replace;
run;

Run our newly created clean() function against the string and export it again. You can see it is now on a single line as desired:

data y;
  set x;
  employer = clean(employer);
run;

proc export data=y outfile="%sysfunc(pathname(work))\y.csv" dbms=csv replace;
run;

Now to apply this method to all character variables in our desired dataset. No need for macros, just define an array referencing all the character variables, and iterate over them applying the clean() function as we go:

data cleaned;
  set x;
  array a[*] _char_;
  do cnt=lbound(a) to hbound(a);
    a[cnt] = clean(a[cnt]);
  end;
run;

EDIT : Also note that fcmp may have some performance considerations to consider. If you are working with very large amounts of data, there may be other solutions that will perform better.

EDIT 6/15/2020 : Corrected missing length statement that could result in truncated responses.

Upvotes: 2

Joe
Joe

Reputation: 63434

Here's an example of Robert Penridge's function, as a call routine with an array as an argument. This probably only works in 9.4+ or possibly later updates of 9.3, when permanent arrays began being allowed to be used as arguments in this way.

I'm not sure if this could be done flexibly with an array as a function; without using macros (which require recompilation of the function constantly) I don't know how one could make the right size of array be returned without doing it as a call routine.

I added 'Z' to the drop list so it's obvious that it works.

options cmplib=work.funcs;

proc fcmp outlib=work.funcs.funcs;
  sub clean(iField[*] $);  
    outargs iField; 
    bad_char_list = byte(11)|| byte(10) || byte(9) || byte(13)||"Z";  
    do _i = 1 to dim(iField);
      iField[_i] = translate(iField[_i],trimn(" "),bad_char_list);
    end;
  endsub;
quit;


data y;
  length employer1-employer5 $20;
  array employer[4] $; 
  do _i = 1 to dim(employer);
    employer[_i] = "Hello"||byte(32)||"Z"||"Goodbye";
  end;
  employer5 = "Hello"||byte(32)||"Z"||"Goodbye";
  call clean(employer);
run;

proc print data=y;
run;

Upvotes: 2

Haikuo Bian
Haikuo Bian

Reputation: 906

Here is another alternative. If newline is the only thing you want to remove, then we are talking about Char only, you may leverage implicit array and Do over,

data want;
set have;
array chr _character_;
do over chr;
chr=compress(chr,,'c');
end;
run;

Upvotes: 1

user667489
user667489

Reputation: 9569

In your macro %update_1 you're creating a macro variable called &cnt, but when you call %update_2 you refer to another macro variable, &colcnt. Try fixing this reference and see if your code behaves as expected.

Upvotes: 3

Related Questions