M vP
M vP

Reputation: 9

Remove Middle Initial but not Middle Name from string

I'm trying to find a way to remove the Middle initial from a string containing the First name and middle initial (example "Mary A" needs to be "Mary").

However, I would need to keep the middle/second name if it was more than an initial (example "Mary Ann" would stay "Mary Ann").

Much thanks,

Matt

Upvotes: 0

Views: 555

Answers (2)

hanna
hanna

Reputation: 655

Try to use the function scan:

data test;
    input name $20.;
    cards;
    Mary A
    Anthony B
    Mary Ann
    Anthony Bernard
    ;
run;

data res;
    set test;
    if (length(scan(name,2))=1) then name=scan(name,1);
run;

As a result, you get:

Mary
Anthony
Mary Ann
Anthony Bernard

Upvotes: 1

user667489
user667489

Reputation: 9569

Here's an example of how to do this using regular expression substitution. I've used proc sql but this would also work in a data step:

data names;
input name & $5.;
cards;
Aa A
Aa Aa
Aaa A
;
run;

proc sql;
  select prxchange('s/^(\w+)\s+\w\s*$/$1/',-1,name) from names;
quit;

The regex is built up as follows:

  • Capture the first word
  • Match a space, a single character, then any number of trailing spaces
  • If the whole expression is a match, return only the first word, otherwise return the whole input string unchanged.

Upvotes: 0

Related Questions