Reputation: 123
I have a Perl regex that I'm fairly certain should work, but it is being too greedy:
Regex:
(?:.*serial[^\d]+?(\d+).*)
Test string:
APPLICATIONSERIALNO123456Plnsn123456te20140728tdrnserialnun12hou
Desired group 1 match:
123456
Actual group 1 match:
12
I've tried every permutation of lookahead, lookbehind, and laziness, and I can't get the damn thing to work.
What am I missing?
Thanks!
Upvotes: 1
Views: 277
Reputation: 386206
The problem is not greediness; it's case-sensitivity.
Currently your regex matches the 12 at the end of serialnun12 because those are the only digits following serial. The ones you want follow SERIAL. S and s are different characters.
There are two solutions.
Use the uppercase characters in the pattern.
my ($serial) = $string =~ /SERIAL\D*(\d+)/;
Use case-insensitive matching.
my ($serial) = $string =~ /serial\D*(\d+)/i;
There's probably no need for this, but I thought I'd mention it just in case.
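To see the difference side by side, here is a minimal sketch that runs the lowercase, uppercase, and case-insensitive patterns against the test string from the question. The variable names ($lower, $upper, $nocase) are only for illustration.

```perl
use strict;
use warnings;

# Test string taken from the question.
my $string = 'APPLICATIONSERIALNO123456Plnsn123456te20140728tdrnserialnun12hou';

# Case-sensitive, lowercase pattern: only the trailing "serialnun12" can match.
my ($lower) = $string =~ /serial\D*(\d+)/;

# Case-sensitive, uppercase pattern: matches "SERIALNO123456".
my ($upper) = $string =~ /SERIAL\D*(\d+)/;

# Case-insensitive pattern: the leftmost match wins, which is "SERIALNO123456".
my ($nocase) = $string =~ /serial\D*(\d+)/i;

print "lowercase: $lower\n";   # 12
print "uppercase: $upper\n";   # 123456
print "/i flag:   $nocase\n";  # 123456
```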
Upvotes: 3
Reputation: 41838
Currently your regex matches the 12 at the end of serialnun12, probably because it is case-sensitive. We have two options: using upper-case, or making the pattern case-insensitive.
Option 1: Use Upper-Case
If you only want 123456, you can use:
SERIALNO\K\d+
The \K tells the engine to drop what was matched so far from the final match it returns.
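A small sketch of \K in action, assuming Perl 5.10 or later (where \K is available) and the test string from the question. The variable $match is just an illustrative name for the overall match held in $&.

```perl
use strict;
use warnings;

my $string = 'APPLICATIONSERIALNO123456Plnsn123456te20140728tdrnserialnun12hou';

# "SERIALNO" must be present for the match to succeed, but \K drops it
# from the reported match, so $& holds only the digits that follow.
my $match;
if ($string =~ /SERIALNO\K\d+/) {
    $match = $&;
    print "matched: $match\n";  # 123456
}
```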
If you want to match the whole string and capture 123456 to Group 1, use:
.*?SERIAL\D+(\d+).*
Option 2: Turning Case-Insensitivity On Using (?i) Inline or the i Flag
To only match 123456, you can use:
(?i)serial\D+\K\d+
Note that if you use the g flag, this would match both numbers.
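To illustrate the g flag behaviour, here is a rough sketch that collects every match in a loop, again assuming the test string from the question. The array name @numbers is hypothetical.

```perl
use strict;
use warnings;

my $string = 'APPLICATIONSERIALNO123456Plnsn123456te20140728tdrnserialnun12hou';

# In scalar context, /g resumes after the previous match on each iteration,
# so the loop visits every "serial" (in any case) that is followed by digits.
my @numbers;
while ($string =~ /(?i)serial\D+\K\d+/g) {
    push @numbers, $&;
}

print join(', ', @numbers), "\n";  # 123456, 12
```

If only the first number is wanted, leaving off the g flag is enough.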
If you want to match the whole string and capture 123456 to Group 1, use:
(?i).*?serial\D+(\d+).*
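A side note on why the leading .*? is lazy in the whole-string patterns: with a greedy .* in front, the engine backtracks from the end of the string and reaches the last serial first, so even a case-insensitive pattern would capture 12. The sketch below compares the two, assuming the test string from the question. The names $first and $last are only for illustration.

```perl
use strict;
use warnings;

my $string = 'APPLICATIONSERIALNO123456Plnsn123456te20140728tdrnserialnun12hou';

# Lazy prefix: .*? grows one character at a time, so the first "serial"
# (in any case) is the one that gets used.
my ($first) = $string =~ /(?i).*?serial\D+(\d+).*/;

# Greedy prefix: .* grabs everything and backtracks from the end,
# so the last "serial" is the one that gets used.
my ($last) = $string =~ /(?i).*serial\D+(\d+).*/;

print "lazy prefix:   $first\n";  # 123456
print "greedy prefix: $last\n";   # 12
```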
A few tips
To turn on case-insensitivity, use the (?i) inline modifier or the i flag at the end of the pattern: /serial\D+\K\d+/i
Instead of [^\d], use \D
There is no need for a lazy quantifier in \D+\d+ because the two tokens are mutually exclusive: there is no danger that the \D will run over the \d
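As a quick check of the last two tips, this sketch compares the greedy \D+, the lazy \D+?, and the original [^\d]+? on the test string from the question. All three should capture the same digits, because \D can never consume a digit. The variable names are only for illustration.

```perl
use strict;
use warnings;

my $string = 'APPLICATIONSERIALNO123456Plnsn123456te20140728tdrnserialnun12hou';

# Greedy \D+ stops at the first digit on its own.
my ($greedy) = $string =~ /SERIAL\D+(\d+)/;

# Lazy \D+? ends up in the same place, just with more work for the engine.
my ($lazy) = $string =~ /SERIAL\D+?(\d+)/;

# [^\d] is the long-hand spelling of \D.
my ($class) = $string =~ /SERIAL[^\d]+?(\d+)/;

print "greedy: $greedy, lazy: $lazy, class: $class\n";
```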
Upvotes: 4