Yumi

Reputation: 25

SAS: How to delete a word between two specific positions?

data:

Hell_TRIAL21_o World
Good Mor_Trial9_ning

How do I remove the _TRIAL21_ and _TRIAL9_?

I found the position of the first _ and the position of the second _. Now I want to remove everything from the first _ through the second _, but the compress function cannot do that. How can I do it?

x = index(string, '_');
if (x>0) then do;
    y = x+1; 
    z = find(string, '_', y);
end;

Upvotes: 1

Views: 1412

Answers (2)

Longfish

Reputation: 7602

Perl regular expressions are a good way of identifying this sort of string. call prxchange is the routine that will remove the relevant characters. It requires prxparse beforehand to compile the search-and-replace pattern.

I've used modify here to amend the existing dataset; you may want to use set to write out to a new dataset and test the results first.

data have;
input string $ 30.;
datalines;
Hell_TRIAL21_o World
Good Mor_Trial9_ning
;
run;


data have;
modify have;
regex = prxparse('s/_.*_//'); /* identify and remove anything between 2 underscores */
call prxchange(regex,-1,string);
run;

Or, to create a new variable and dataset, just use the prxchange function (which doesn't require prxparse):

data want;
set have;
new_string = prxchange('s/_.*_//',-1,string);
run;
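One thing to keep in mind about the pattern: .* is greedy, so it runs to the last underscore in the value, not the next one. That makes no difference for the sample data, where each value has exactly two underscores. Here is a minimal sketch of the difference, assuming a made-up value with three underscores; the dataset name demo and the variables greedy and lazy are only for illustration.

```sas
data demo;
    length string greedy lazy $ 40;
    string = 'Hell_TRIAL21_o Wor_ld';   /* hypothetical value with a third underscore */

    /* greedy: .* extends to the LAST underscore in the value */
    greedy = prxchange('s/_.*_//', -1, string);

    /* lazy: .*? stops at the NEXT underscore */
    lazy = prxchange('s/_.*?_//', -1, string);

    put greedy= lazy=;   /* greedy=Hellld lazy=Hello Wor_ld */
run;
```

If your real data can contain extra underscores, the lazy form is probably the one you want.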

Upvotes: 2

yukclam9

Reputation: 336

Text = " Hell_TRIAL21_o World Good Mor_Trial9_ning";

var = catx("", scan(text,1,"_"), "__", scan(text,3,"_"), "_", scan(text,5,"_"));

Note that the length of the variable var may not suit your case. Remember to adjust it accordingly.
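If you would rather work row by row, the position idea from the question can also be finished with substrn, which, unlike substr, accepts a zero-length piece. This is only a sketch under a few assumptions: the dataset have still holds the original values, string is at most 30 characters (taken from the informat used in the other answer), and new_string, x and z are just names chosen here.

```sas
data want;
    set have;
    length new_string $ 30;            /* assumed length, adjust to your data */
    new_string = string;               /* default: leave the value unchanged */

    x = index(string, '_');            /* position of the first underscore */
    if x > 0 then do;
        z = find(string, '_', x + 1);  /* position of the next underscore */
        /* keep the text before the first underscore and after the second */
        if z > 0 then
            new_string = cat(substrn(string, 1, x - 1), substrn(string, z + 1));
    end;

    drop x z;
run;
```

cat is used rather than cats or catx so that blanks inside the pieces, such as the one in "Good Mor", are kept as they are.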

Upvotes: 3
