Jonah Bishop

Reputation: 12581

Stripping a middle initial with a regex

I know I'm doing something stupid here, but I'm tired and I'm apparently just not seeing it. I have the following script:

#!/usr/bin/perl
use strict;
use warnings;

my @names = (
    "John Q. Public",
    "James K Polk"
);

foreach (@names)
{
    print "Before: $_\n";
    s/\b[A-Z]\.?\b//;
    print "After:  $_\n";
}

When I run this script, I get the following output:

Before: John Q. Public
After:  John . Public      <== Why is the period still here?
Before: James K Polk
After:  James  Polk

Note that in the John Q. Public example, the period is left behind. Isn't the optional quantifier (?) greedy? According to the perlre docs:

? Match 1 or 0 times

Shouldn't the period disappear along with the middle initial? What am I missing here?

Upvotes: 0

Views: 532

Answers (2)

Borodin

Reputation: 126722

I think I would choose to split the name on whitespace and select just the first and last fields.

Like this:

use strict;
use warnings;

my @names = ("John Q. Public", "James K Polk");

foreach (@names) {
  print "Before: $_\n";
  $_ = join ' ', (split)[0, -1];
  print "After:  $_\n";
}

output

Before: John Q. Public
After:  John Public
Before: James K Polk
After:  James Polk

Upvotes: 1

choroba

Reputation: 241918

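The problem is that \b cannot match between the period and the following space, because both are non-word characters:

". " =~ /\.\b/ or print "There is no word boundary between a dot and a space.\n";

The ? is greedy and does try the period first, but the trailing \b then fails. The regex backtracks, lets \.? match nothing, and finds the word boundary between "Q" and "." instead, so only the letter is removed.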
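One possible fix that follows from the word-boundary point above is to put \b directly after the letter and make the period optional after it. This is only a sketch: the helper name strip_middle_initial is made up for illustration, and the pattern assumes the initial is a single capital ASCII letter with whitespace on both sides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the question): remove a single-letter
# middle initial, with or without a trailing period.
sub strip_middle_initial {
    my ($name) = @_;
    # \b comes right after the letter, where a word character meets a
    # non-word character, so it matches whether or not a period follows.
    # The leading \s+ is removed along with the initial; the lookahead
    # leaves the space before the surname in place.
    $name =~ s/\s+[A-Z]\b\.?(?=\s)//;
    return $name;
}

for my $name ("John Q. Public", "James K Polk") {
    print "Before: $name\n";
    print "After:  ", strip_middle_initial($name), "\n";
}
```

Consuming the leading whitespace also avoids the double space that the original substitution leaves in "James  Polk". A name with no middle initial, such as "John Public", is left unchanged, because \b cannot match between "P" and "u".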

Upvotes: 3
