David B

Reputation: 29978

How can I match everything after the last occurrence of some character in a Perl regular expression?

For example, return the part of the string that is after the last x in axxxghdfx445 (should return 445).

Upvotes: 22

Views: 35730

Answers (7)

benzebuth

Reputation: 695

The first answer is a good one, but when the goal is "something that does not contain x", I like to use a regex that matches exactly that:

my ($substr) = $string =~ /.*x([^x]*)$/;

Very useful in some cases.

Upvotes: 7

Ether

Reputation: 53966

I'm surprised no one has mentioned the special variable that does this, $': "$'" returns everything after the matched string. (perldoc perlre)

my $str = 'axxxghdfx445';
$str =~ /.*x/;    # greedy .* so the match ends at the last x (a bare /x/ stops at the first one)

# $' contains '445'
print $';

However, there is a cost (emphasis mine):

WARNING: Once Perl sees that you need one of $&, "$`", or "$'" anywhere in the program, it has to provide them for every pattern match. This may substantially slow your program. Perl uses the same mechanism to produce $1, $2, etc, so you also pay a price for each pattern that contains capturing parentheses. (To avoid this cost while retaining the grouping behaviour, use the extended regular expression "(?: ... )" instead.) But if you never use $&, "$`" or "$'", then patterns without capturing parentheses will not be penalized. So avoid $&, "$'", and "$`" if you can, but if you can't (and some algorithms really appreciate them), once you've used them once, use them at will, because you've already paid the price. As of 5.005, $& is not so costly as the other two.

But wait, there's more! You get two operators for the price of one, act NOW!

As a workaround for this problem, Perl 5.10.0 introduces "${^PREMATCH}", "${^MATCH}" and "${^POSTMATCH}", which are equivalent to "$`", $& and "$'", except that they are only guaranteed to be defined after a successful match that was executed with the "/p" (preserve) modifier. The use of these variables incurs no global performance penalty, unlike their punctuation char equivalents, however at the trade-off that you have to tell perl when you want to use them.

my $str = 'axxxghdfx445';
$str =~ /.*x/p;   # greedy .* so the match ends at the last x; /p preserves the match variables

# ${^POSTMATCH} contains '445'
print ${^POSTMATCH};

I would humbly submit that this route is the best and most straight-forward approach in most cases, since it does not require that you do special things with your pattern construction in order to retrieve the postmatch portion, and there is no performance penalty.

Upvotes: 4

FMc

Reputation: 42411

Yet another way to do it. It's not as simple as a single regular expression, but if you're optimizing for speed, this approach will probably be faster than anything using regex, including split.

my $s     = 'axxxghdfx445';
my $p     = rindex $s, 'x';
my $match = $p < 0 ? undef : substr($s, $p + 1);
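If this is needed in more than one place, the same idea can be wrapped in a small sub. This is only a sketch: the name after_last and the choice to return undef when the separator is absent are assumptions for illustration, not part of any module.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text after the last occurrence of $sep
# in $s, or undef if $sep does not occur at all.
sub after_last {
    my ($s, $sep) = @_;
    my $p = rindex($s, $sep);
    return $p < 0 ? undef : substr($s, $p + length($sep));
}

print after_last('axxxghdfx445', 'x'), "\n";   # prints 445
```

Note that a string ending in the separator gives an empty string rather than undef, so callers can tell "nothing after the last x" apart from "no x at all".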

Upvotes: 4

ghostdog74

Reputation: 342313

The simplest way is not a regular expression at all, but a plain split() followed by taking the last element.

my $string = "axxxghdfx445";
my @s = split /x/, $string;   # split on every x
print $s[-1];                 # last field, i.e. the text after the last x
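One caveat with split: by default it drops trailing empty fields, so for a string that ends in x (such as "abcx") the last element is "abc" rather than the empty string. Passing a negative limit keeps those fields. A minimal sketch, where the sub name last_field is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: last field after splitting on 'x'.
sub last_field {
    my ($string) = @_;
    my @parts = split /x/, $string, -1;   # limit of -1 keeps trailing empty fields
    return $parts[-1];
}

print "[", last_field('axxxghdfx445'), "]\n";   # prints [445]
print "[", last_field('abcx'), "]\n";           # prints []
```

If the string contains no x at all, the single field returned is the whole string.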

Upvotes: 6

Nikhil Jain

Reputation: 8332

Regular expression: /([^x]+)$/ (assuming x is not the last character of the string).

Upvotes: 2

reko_t

Reputation: 56430

The simplest way would be to use /([^x]*)$/
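A short sketch of how this pattern behaves on edge cases, compared with the + variant. The sub names tail_star and tail_plus are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helpers: capture the run of non-x characters at the end.
sub tail_star { my ($s) = @_; my ($t) = $s =~ /([^x]*)$/; return $t }
sub tail_plus { my ($s) = @_; my ($t) = $s =~ /([^x]+)$/; return $t }

for my $s ('axxxghdfx445', 'abcx', 'abc') {
    my $star = tail_star($s);
    my $plus = tail_plus($s);
    printf "%-14s star=[%s] plus=[%s]\n",
        $s, $star, defined $plus ? $plus : 'undef';
}
```

With * the pattern always matches: a string ending in x yields an empty capture, and a string with no x yields the whole string. With + the string ending in x does not match at all, so the capture is undef.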

Upvotes: 19

Eugene Yarmash

Reputation: 149736

my($substr) = $string =~ /.*x(.*)/;

From perldoc perlre:

By default, a quantified subpattern is "greedy", that is, it will match as many times as possible (given a particular starting location) while still allowing the rest of the pattern to match.

That's why .*x will match up to the last occurrence of x.
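To see the effect of greediness, here is a small comparison with the non-greedy quantifier .*? on the string from the question. The variable names are only for illustration.

```perl
use strict;
use warnings;

my $string = 'axxxghdfx445';

# Greedy: .* grabs as much as possible, then backs off to the last x.
my ($greedy) = $string =~ /.*x(.*)/;

# Non-greedy: .*? grabs as little as possible, so it stops at the first x.
my ($lazy) = $string =~ /.*?x(.*)/;

print "greedy: $greedy\n";   # prints greedy: 445
print "lazy: $lazy\n";       # prints lazy: xxghdfx445
```

Since . does not match a newline by default, a string spanning several lines would need the /s modifier for .* to reach the last x in the whole string.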

Upvotes: 25
