Reputation: 23
My question has 2 parts...
First, I'm trying to extract the FIRST set of numbers separated by a slash ("12/56" in this case), and ignore the 2nd set (if it exists).
Sample String:
some text 12/56 34/67 ABCD1234 --Want to grab "12/56", but ignore "34/67"
more text 14/58 DEFG5678 --Want to grab "14/58".
I've tried using (\d\d\/\d\d)? as the pattern (non-greedy); however, it doesn't stop after the first hit.
Second, once the above problem is solved, I still need to grab the 8-digit code after it (there will ALWAYS be an 8-digit code). I'd like to use something like (\d\d\/\d\d)?.+([A-Z0-9]{8}); however, I suspect that the correct non-greedy search may stop the regex in its tracks before it reaches the code. Is this possible?
Upvotes: 1
Views: 829
Reputation: 92976
Just remove the ? after the first capturing group.
(\d\d\/\d\d).+([A-Z0-9]{8})
See it here on Regexr; hovering the mouse over the highlighted match shows the content of the capturing groups.
Explanation:
With the ? you don't make the group "non-greedy"; you make it optional. So, because your lines don't start with a digit, the regex skips the optional part and matches everything with the following .+ up to your last part.
You don't need "non-greedy" behaviour here: your pattern will match the first occurrence anyway. You can make a quantifier "ungreedy", but not a group.
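The difference between the two patterns can be sketched in Python's re module. Python is an assumption here, since the question doesn't name a language; the sample lines are the ones from the question with the trailing comments dropped, and the variable names are only illustrative.

```python
import re

# Sample lines from the question, without the trailing "--Want to grab" comments.
lines = ["some text 12/56 34/67 ABCD1234", "more text 14/58 DEFG5678"]

# The "?" after the group makes the whole group optional.
optional_group = re.compile(r"(\d\d\/\d\d)?.+([A-Z0-9]{8})")
# Without the "?", group 1 has to match for the pattern to succeed.
required_group = re.compile(r"(\d\d\/\d\d).+([A-Z0-9]{8})")

for line in lines:
    opt = optional_group.search(line)
    req = required_group.search(line)
    # Optional group: the match starts at position 0, where there is no digit,
    # so group 1 is skipped and reported as None.
    print("optional:", opt.group(1), opt.group(2))
    # Required group: the engine moves forward until group 1 can match,
    # which is at the first nn/nn pair on the line.
    print("required:", req.group(1), req.group(2))
```

In both cases the greedy .+ backtracks just far enough to leave the last 8 matching characters for group 2, so the code is captured either way; only group 1 differs.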
Upvotes: 1
Reputation: 336108
(\d\d/\d\d)\s+(?:\d\d/\d\d)?\s*([A-Z0-9]{8})
grabs the first but ignores the second set of nn/nn strings (if present), then grabs the next 8 uppercase ASCII alnum characters, assuming that nothing but whitespace will be between those items.
The results will then be in groups 1 and 2. So, for example in Python, you'd use
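import re

# Group 1: the first nn/nn pair; group 2: the 8-character code.
reobj = re.compile(r"(\d\d/\d\d)\s+(?:\d\d/\d\d)?\s*([A-Z0-9]{8})")
match = reobj.search(subject)  # subject holds the input string
if match:
    first = match.group(1)
    second = match.group(2)
else:
    print("No match!")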
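As a rough check, the same pattern can be run against the two sample lines from the question (trailing comments dropped). The extract helper and the third sample line are made up for illustration; the third line shows the whitespace-only assumption failing.

```python
import re

PATTERN = re.compile(r"(\d\d/\d\d)\s+(?:\d\d/\d\d)?\s*([A-Z0-9]{8})")

def extract(subject):
    """Return (first_pair, code), or None if the pattern does not match."""
    match = PATTERN.search(subject)
    return match.groups() if match else None

samples = [
    "some text 12/56 34/67 ABCD1234",  # two nn/nn pairs, second one is ignored
    "more text 14/58 DEFG5678",        # only one nn/nn pair
    "12/56 and then ABCD1234",         # hypothetical: extra words in between
]
for s in samples:
    print(extract(s))
```

The third line yields None because the pattern only allows whitespace (and an optional second pair) between the first pair and the code.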
Upvotes: 0
Reputation: 28687
Which language are you using these regular expressions in? Are you using a "find" or a "match" method? As long as you're using a "match" method, your last example (the "something like") should almost work as you'd expect, but I'd remove the ? after the first group of digits unless you have a specific need for it:
(\d\d/\d\d).+([A-Z0-9]{8})
With the "match" method, this forces both groups to be populated for the match to succeed.
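How "find" and "match" behave depends on the language. As one illustration, assuming Python (the question doesn't say which language is in use): search scans for the pattern anywhere in the string, match only tries at the start, and fullmatch requires the whole string to match. The leading .*? in the last pattern is an addition for this sketch so that text before the first nn/nn pair is allowed.

```python
import re

pattern = re.compile(r"(\d\d/\d\d).+([A-Z0-9]{8})")
line = "some text 12/56 34/67 ABCD1234"

# "find"-style: scans forward until the pattern fits somewhere in the string.
found = pattern.search(line)

# Anchored at position 0: the line starts with "some text", not with digits,
# so this attempt does not match and returns None.
anchored = pattern.match(line)

# Whole-string match: a lazy prefix absorbs the text before the first pair.
whole = re.fullmatch(r".*?(\d\d/\d\d).+([A-Z0-9]{8})", line)

print(found.groups())
print(anchored)
print(whole.groups())
```

When a match is reported, both groups are populated, since neither group is optional in these patterns.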
Upvotes: 0