Reputation: 293
I am new to SAS and would like to keep only what's in between the parentheses of another column in my new column using compress and SCAN:
x
abc(animal)
efg(food)
hij(plant)
I tried:
DATA NEW
set OLD;
y = (compress(scan([x], 2, '('), ')');
RUN;
PROC PRINT NEW;
RUN;
but I can't seem to get it to work. Any insight would be helpful.
Upvotes: 2
Views: 663
Reputation: 3315
@tom answer is perfect and another way to do this is using prxsubstr.
data have;
input x $20.;
datalines;
abc(animal)
efg(food)
hij(plant)
;
prxparse \(.+?\) is to find anything between parenthesis
\( -- starting of parenthesis
.+?\( --- anything till closing parenthesis
call prxsubstr captures where it finds pattern and length of it. then doing substr to get the values inside the parenthesis
data want;
set have;
if _n_= 1 then do;
retain re;
re = prxparse('/\(.+?\)/');
end;
call prxsubstr(re, trim(x), position, length);
if position gt 0 then
new=substr(trim(x),position+1, length-2);
run;
Upvotes: 3
Reputation: 51611
There should be no need for compress()
, unless you want to remove some other characters from the value extracted. You might want to use left()
to remove leading spaces from values like xxx( yyy )
.
y=scan(x,2,'()');
If you have some values like (no first word)
then you might need to use index of 1 instead of two. If so you could use =:
to test if X starts with an open paren.
y=scan(x,1+(x^=:'('),'()');
If you know the () are always at the end you could use index of -1. But again if some strings had characters after the close paren then you would need to use -2 instead.
Upvotes: 2