Allen

Reputation: 471

Regular expression to capture last occurrence of a pattern

I have tried several ways to match the last occurrence of a pattern, but none of them work. Here is my case:

abc def = ghi
abc def ghi = jkl
abc def ghi=jkl mno

For the first line, my capture target is "def"; for the second and third lines, it is "ghi". In words, the target is "the last word before the equals sign".

What should the Perl regular expression look like?

Upvotes: 3

Views: 20682

Answers (4)

Shawn Darichuk

Reputation: 390

Jack's answer is probably the best, but I can't wrap my head around how it works. I like breaking things down into smaller chunks.

use warnings;
use strict;

my @strings = ( "abc def = ghi",
                "abc def ghi = jkl",
                "abc def ghi=jkl mno"
                );
# print the target word for each string
foreach (@strings) {
    my $last = get_last($_);
    print "$last\n";
}

sub get_last {
    my $string = shift;
    # group things as left side or right side
    my $left_side;
    my $right_side;
    if ($string =~ /(.*)=(.*)/) {
        $left_side = $1;
        $right_side = $2;
    }
    else {
        # no equals sign, so there is nothing to capture
        return;
    }

    # split things according to whitespace and store in an array
    my @left_side = split (/\s+/, $left_side);

    # return the last element of that array
    return $left_side[-1];
}

Upvotes: 1

ikegami

Reputation: 385645

\b(\w+)\s*= would suffice for your examples. It matches a word, followed by optional whitespace, followed immediately by =. The \b reduces backtracking.

\b(\w+)[^\w=]*= matches your "verbal expression" more precisely. For example, it will match abc in abc !@# = def.

  • \b matches between a \w and a \W (or at the start or end of the string next to a \w).
  • \w matches a word character.
  • \W matches a character that's not a word character.
  • \s matches a whitespace character.
  • [^\w=] matches a non-word character other than =.
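
As a rough illustration, here is a minimal sketch that runs both patterns against the question's sample lines plus the abc !@# = def case. The capture helper and the variable names are made up for this example.

```perl
use strict;
use warnings;

# Sample lines from the question, plus the punctuation case mentioned above.
# Single quotes keep Perl from trying to interpolate the "@".
my @lines = (
    'abc def = ghi',
    'abc def ghi = jkl',
    'abc def ghi=jkl mno',
    'abc !@# = def',
);

# Hypothetical helper: returns the captured word, or undef if there is no match.
sub capture {
    my ($re, $line) = @_;
    return $line =~ $re ? $1 : undef;
}

my $simple  = qr/\b(\w+)\s*=/;       # word, optional whitespace, then "="
my $precise = qr/\b(\w+)[^\w=]*=/;   # word, any non-word non-"=" chars, then "="

for my $line (@lines) {
    my $s = capture($simple,  $line) // '(no match)';
    my $p = capture($precise, $line) // '(no match)';
    print "$line => simple: $s, precise: $p\n";
}
```

On the last line only the second pattern should find a word, because the first one allows nothing but whitespace between the word and the =.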

Upvotes: 4

user554546

Reputation:

You don't really need a capturing regex, either. You can:

  • split the string on /\s*=\s*/
  • grab the first element of the resulting array (i.e., everything before the equals sign, with whitespace stripped off the right end)
  • split that first element on /\s+/
  • take the last element of the array that this second split returns.

In other words:

use strict;
use warnings;

my $str1 = "abc def = ghi";
my $str2 = "abc def ghi = jkl";
my $str3 = "abc def ghi=jkl mno";

sub grab_target{
    my $str = shift;
    return (split(/\s+/, (split(/\s*=\s*/, $str))[0]))[-1];
}

foreach my $str ($str1, $str2, $str3) {
    print grab_target($str), "\n";
}

The resulting output is:

def
ghi
ghi

Upvotes: -2

alpha bravo

Reputation: 7948

You could use this pattern:

(\w+)(?=\s*=)

(               # Capturing Group (1)
  \w            # <word character: letter, digit or underscore>
  +             # (one or more)(greedy)
)               # End of Capturing Group (1)
(?=             # Look-Ahead
  \s            # <whitespace character>
  *             # (zero or more)(greedy)
  =             # "="
)               # End of Look-Ahead
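
A minimal sketch of how this pattern might be used from Perl on the question's three lines. The helper name word_before_equals is made up for this example. Because the look-ahead consumes nothing, the overall match is just the word itself.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the word right before the first "=",
# or undef if the line has no such word.
sub word_before_equals {
    my ($line) = @_;
    return $line =~ /(\w+)(?=\s*=)/ ? $1 : undef;
}

for my $line ('abc def = ghi', 'abc def ghi = jkl', 'abc def ghi=jkl mno') {
    my $word = word_before_equals($line) // '(no match)';
    print "$word\n";
}
```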

Upvotes: 10
