Reputation: 96797
Assume the following strings:
A01B100
A01.B100
A01
A01............................B100
(whatever is between A and B). The thing is, the numbers should be \d+, and in all of the strings A will always be present, while B may not. A will always be followed by one or more digits, and so will B, if present. What regex could I use to capture A's and B's digits?
I have the following regex:
(A(\d+)).*?(B?(\d+)?)
but this only works for the first and the third case.
Upvotes: 1
Views: 144
Reputation: 14086
Does A always precede B? Assuming yes.
Can B appear more than once? Assuming no.
Can B appear except as part of a B-number group? Assuming no.
Then,
A\d+.*?(B\d+)?
using the lazy .*? or
A\d+[^B]*(B\d+)?
which is more efficient but requires that B be a single character.
EDIT: Upon further reflection, I have parenthesized the patterns in a less-than-perfect way. The following patterns should require fewer assumptions:
A\d+(.*?B\d+)?
A\d+([^B]*B\d+)?
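To check the revised pattern against the four strings from the question, here is a minimal Python sketch. The capturing groups around the digits, and the names `PATTERN` and `extract`, are my additions for illustration; the patterns above match the strings but do not capture the digits on their own.

```python
import re

# Revised pattern, with capture groups added around the digits
# (an illustrative addition, not part of the answer's patterns).
PATTERN = re.compile(r"A(\d+)(?:[^B]*B(\d+))?")

def extract(s):
    """Return (a_digits, b_digits); b_digits is None when B is absent."""
    m = PATTERN.match(s)
    return m.groups() if m else None

for s in ["A01B100", "A01.B100", "A01", "A01" + "." * 28 + "B100"]:
    print(s, extract(s))
```

Because the filler and the B part sit inside the same optional group, the filler is only consumed when a B-number actually follows it.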
Upvotes: 3
Reputation: 18292
A\d+.*(B\d+)?
OK, so that provides something which passes all test cases... BUT it has some false positives.
A\d+(.*B\d+)?
It seems other characters should only appear if B(whatever) is after them, so use the above instead.
# Perl test-case hack-up
use strict;
use warnings;

my @array = ('A01B100', 'A01.B100', 'A01', 'A01............................B100', 'A01FAIL', 'NEVER');
for (@array) {
    # Print only the strings that match the anchored pattern
    print "$_\n" if /^A\d+(.*B\d+)?$/;
}
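The difference between the two patterns in this answer can also be sketched in Python. This is only an illustration: the capture groups and the names `loose`, `strict`, and `groups_for` are assumptions of mine, added so the captured digits are visible.

```python
import re

# Greedy .* placed before an optional group: .* swallows the rest of the
# string, so the optional B group is simply skipped.
loose = re.compile(r"A(\d+).*(?:B(\d+))?")

# Filler is allowed only when a B-number follows it, and the whole string
# must match because of the ^ and $ anchors.
strict = re.compile(r"^A(\d+)(?:.*B(\d+))?$")

def groups_for(pattern, s):
    """Return the captured (A digits, B digits), or None if no match."""
    m = pattern.search(s)
    return m.groups() if m else None

for s in ["A01.B100", "A01", "A01FAIL", "NEVER"]:
    print(s, groups_for(loose, s), groups_for(strict, s))
```

With the loose pattern, `A01FAIL` is accepted and the B digits of `A01.B100` are never captured; the strict pattern rejects `A01FAIL` and captures `100`.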
Upvotes: 0
Reputation: 101671
import re

# Note: \.* allows only dots between the A and B groups
m = re.match(r"A(?P<d1>\d+)\.*(B(?P<d2>\d+))?", "A01.B100")
print(m.groupdict())  # {'d1': '01', 'd2': '100'}
Upvotes: 0
Reputation: 1324073
(?ms)^A(\d+)(?:[^\n\r]*B(\d+))?$
Assuming one string per line:
the [^\n\r]* is a greedy match for any characters (except newlines) after Axx, meaning it could gobble an intermediate Byy before the last B:
A01...B01...B23
would be matched, with 01 and 23 detected.
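A small Python sketch of that behaviour, for comparison with a lazy variant. The lazy pattern is a hypothetical alternative of mine, deliberately left without the trailing `$` so that it stops at the first B; with the `$` anchor it would be forced on to the last B as well.

```python
import re

line = "A01...B01...B23"

# The pattern from this answer: the filler runs as far as it can,
# so the captured B is the last one on the line.
greedy = re.compile(r"(?ms)^A(\d+)(?:[^\n\r]*B(\d+))?$")

# Hypothetical lazy variant with no end anchor: stops at the first B.
lazy = re.compile(r"^A(\d+)(?:[^\n\r]*?B(\d+))?")

greedy_groups = greedy.match(line).groups()
lazy_groups = lazy.match(line).groups()

print(greedy_groups)  # ('01', '23')
print(lazy_groups)    # ('01', '01')
```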
Upvotes: 1