Reputation: 25
data:
Hell_TRIAL21_o World
Good Mor_Trial9_ning
How do I remove the _TRIAL21_ and _TRIAL9_?
What I did was find the position of the first _ and the second _. I then want to remove everything from the first _ through the second _, but the compress function cannot do that. How can I do this?
x = index(string, '_');
if (x>0) then do;
y = x+1;
z = find(string, '_', y);
end;
Upvotes: 1
Views: 1412
Reputation: 7602
Perl regular expressions are a good way of identifying this sort of string. call prxchange is the call routine that will remove the relevant characters. It requires prxparse beforehand to create the search and replace parameters.
I've used modify here to amend the existing dataset. You may want to use set instead to write out to a new dataset and test the results first.
data have;
input string $ 30.;
datalines;
Hell_TRIAL21_o World
Good Mor_Trial9_ning
;
run;
data have;
modify have;
regex = prxparse('s/_.*_//'); /* identify and remove anything between 2 underscores */
call prxchange(regex,-1,string);
run;
Or, to create a new variable and dataset, just use prxchange (which doesn't require prxparse).
data want;
set have;
new_string = prxchange('s/_.*_//',-1,string);
run;
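One caveat worth checking on your own data: the pattern _.*_ is greedy, so if a string ever contains more than one tagged segment it removes everything from the first underscore to the last one. A rough sketch of the difference, using a non-greedy quantifier (.*?) instead. The dataset names have2 and want2, the variables greedy_result and lazy_result, and the third test row are made up for illustration.

```sas
/* Hypothetical test data: the third row has two tagged segments. */
data have2;
input string $ 30.;
datalines;
Hell_TRIAL21_o World
Good Mor_Trial9_ning
A_x_B_y_C
;
run;

data want2;
set have2;
length greedy_result lazy_result $ 30;
/* Greedy: .* runs to the LAST underscore, so the third row becomes AC */
greedy_result = prxchange('s/_.*_//', -1, string);
/* Non-greedy: .*? stops at the NEXT underscore, so the third row becomes ABC */
lazy_result = prxchange('s/_.*?_//', -1, string);
run;
```

For the two strings in the question, both patterns give the same result, since each string contains exactly two underscores.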
Upvotes: 2
Reputation: 336
Text= " Hell_TRIAL21_o World Good Mor_Trial9_ning";
var= catx("",scan(text,1,"_"),"__",scan(text,3,"_"),"_", scan(text,5,"_"));
Note that the length of the variable var may not suit your case. Remember to adjust it accordingly.
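Another option without regular expressions is to finish the index/find approach from the question: once both underscore positions are known, keep the text before the first one and the text after the second one. This is only a sketch. The dataset names have_demo and want_substr and the variable new_string are made up for illustration, and it assumes at most one tagged segment per string.

```sas
/* Hypothetical copy of the sample data from the question. */
data have_demo;
input string $ 30.;
datalines;
Hell_TRIAL21_o World
Good Mor_Trial9_ning
;
run;

data want_substr;
set have_demo;
length new_string $ 30;
x = index(string, '_');                     /* first underscore, 0 if none */
if x > 0 then z = find(string, '_', x + 1); /* second underscore, 0 if none */
if x > 0 and z > 0 then
  /* text before the first underscore, then text after the second */
  new_string = substrn(string, 1, x - 1) || substrn(string, z + 1);
else new_string = string;                   /* no pair found: leave unchanged */
drop x z;
run;
```

substrn is used rather than substr because it accepts a length of zero, which happens when the string starts with an underscore.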
Upvotes: 3