CheeseConQueso

Reputation: 6041

How can I remove middle initial with a dot at the end?

I've got a bunch of first names in a field that carry a middle initial with a '.' at the end.

I need a regex to convert this example:

Kenneth R.

into

Kenneth

I was trying to build my own and found this useful site, by the way:

http://www.gskinner.com/RegExr/

but I'm new to Perl and regular expressions and could only come up with "...$", which is useless when there is no middle initial at the end of the first name.


I just found another name format that needs consideration: 'R. Kelly' needs to become 'Kelly'.

Upvotes: 4

Views: 2772

Answers (2)

Jon Ericson

Reputation: 21515

To take care of the R. Kelly case:

s/\w\. *//g

Here's a quick test:

$ echo 'R. Kelly
Kenneth R.
R. Kemp R.
John Q. Smith' | perl -pe 's/\w\. *//g'
Kelly
Kenneth 
Kemp 
John Smith

I'd suggest that:

  1. The global option (g) is required.
  2. The case insensitive option (i) isn't.
  3. You might consider looking for upper case ([[:upper:]]) initials only.
  4. Multiple character "initials" should be viewed with suspicion. (So \w+ is probably a mistake unless your data has relevant cases.)
  5. Read perldoc perlre for more information.

Upvotes: 2

siukurnin

Reputation: 2902

To remove the last "word" if it ends with a dot:

$name =~ s/\w+\.$//i;

(This assumes there is no whitespace after the dot.)

To remove any word ending with a dot:

$name =~ s/\w+\.//i;

Look at the /g modifier if you want to remove them all.

By the way, build yourself a list of test cases to check your solution, then try it on real-world data; you will probably get some surprises.

Upvotes: 3
