meuleman
meuleman

Reputation: 378

Perl word disenvoweling: removing all vowels except the first and last

In order to shorten a number of names, but still keep them somewhat readable, I would like to remove all vowels from strings, except for the first and last occurrence. For example, I'd like 'Minnesota' to become 'Minnsta'.

my $name="Minnesota";

I've tried to make use of Perl's zero-width look-behind regex syntax, like so:

$name =~ s/(?<=[aeiou])([^aeiou]*)[aeiou]/$1/ig; # minnst

However, although this properly takes care of the first vowel, it removes the last one. To fix this, I tried to keep the last vowel in place, like this:

$name =~ s/(?<=[aeiou])([^aeiou]*)([aeiou])([aeiou][^aeiou]*)$/$1$3/ig; # minnesota

This also didn't work, presumably because the '$' anchors the whole regex to the end of the string.

Of course, I could look up the position of the first vowel, reverse the rest of the string and remove all vowels except for the 'first' (last), and re-reverse and concatenate the strings, but that's not very elegant. I feel I'm overlooking one of the options of the zero-width syntax.

Upvotes: 2

Views: 1224

Answers (3)

emax
emax

Reputation: 387

For me, this one works (the '1' in front is intentional):

1 while ($name =~ s/^(.+)[AEIOUaeiou]/$1/g );

if you want to keep a minimum length of $name (e.g. 3):

1 while (length $name > 3 && $name =~ s/^(.+)[AEIOUaeiou]/$1/g );

Instead of writing 'AEIOUaeiou' you can of course use the 'i' flag to ignore the case. I wrote it down explicitly for easier reading.

You can of course put any characters in the brackets.

Upvotes: 1

marneborn
marneborn

Reputation: 699

Make sure that there is a vowel after the MATCH, but exclude it from the MATCH.

$name =~ s/(?<=[aeiou])([^aeiou]*)[aeiou](?=.*[aeiou])/$1/ig;

The substitutions done by your regexp are:

  • Minnesota => nne -> nn => Minnsota
  • Minnsota => nnso -> nns => Minnsta
  • Minnsta => nnsta -> nnst => Minnst
  • Minnst => nnsta -> nnst => Minnst

So the last substitution swaps 'nnsta' with 'nnst'.

my $name="Minnesota";
my $prev = '';
while ( $name ne $prev ) {
    $prev = $name;
    $name =~ s/(?<=[aeiou])([^aeiou]*)[aeiou]/$1/i;
    print "$prev => ${^MATCH} -> $1 => $name\n";
}

Upvotes: 0

Miller
Miller

Reputation: 35208

Just specify a ending boundary condition for your regex: (?![^aeiou]*$):

use strict;
use warnings;

my @words = qw(Minnesota concatenate strings elegant I feel overlooking options syntax any greatly appreciated);

for (@words) {
    my $word = $_;

    $word =~ s/(?<=[aeiou])([^aeiou]*)[aeiou](?![^aeiou]*$)/$1/ig;

    printf "%-12s -> %s\n", $_, $word;
}

Outputs:

Minnesota    -> Minnsta
concatenate  -> conctnte
strings      -> strings
elegant      -> elgant
I            -> I
feel         -> feel
overlooking  -> ovrlking
options      -> optons
syntax       -> syntax
any          -> any
greatly      -> greatly
appreciated  -> apprcted

Upvotes: 3

Related Questions