Reputation: 141
I have the following dataset
DATA EXAMPLE1;
INPUT Names $char30.;
DATALINES;
AARON RAY, MD INC
AARON,RAY MD (1371927)
RAY,AARON,MD
;
run;
I want to delete all characters after 'MD'. Expecting below output
Names Want_names
AARON RAY, MD INC AARON RAY, MD
AARON,RAY MD (1371927) AARON,RAY MD
RAY,AARON,MD RAY,AARON,MD
Any suggestions?
Thanks
Upvotes: 2
Views: 2341
Reputation: 63434
Probably overkill, but you certainly could use a regular expression for this.
data want;
set example1;
rx_name = prxparse('s~(.*)([ ,]+MD)(.*)~$1$2~ios');
new_name = prxchange(rx_name,1,names);
run;
I grab three groups: the part before MD, then at least one of space or comma and MD, then all the rest of the characters; then replace with just the first two groups.
Upvotes: 1
Reputation: 502
In this case, just keep the characters your want instead of deleting what you don't want.
You could use the find
function (with the c
) to determine where your target string (MD) starts.
spot = findc(names,'MD');
Now you have a variable that determines where your target string begins.
Taking advantage of the substr
function you can now control the exact spot that you want the string to end (essentially cutting out the MD
characters), and leaving you with what's left after your remove your target (from the point of spot
and on).
Given that you want the entire string prior to your target character MD
, you'll control your substring argument by setting your end position from your spot
, minus 1 position:
want = substr(names,1,spot-1);
To verify your position of spot
, control your data step to keep your pointer variable spot
for reference.
want
will re-create your string with your target MD
removed.
In an effort to clean the string (if necessary), you could use some other functions to remove the now trailing special characters (like the ,
at the end of your string for records 1 and 3).
data search;
set example1;
spot = findc(names,'MD');
want = substr(names,1,spot-1);
/*drop spot;*/
run;
Upvotes: 0