Reputation: 21
I'm trying to match a six-digit number that is not adjacent to any other digit, i.e. not part of a longer run of digits. The number can appear at the beginning of the string, anywhere in the middle, or at the end. It may have commas or text around it; what matters is that each match is a distinct block of exactly six digits. I've pulled my hair out trying lookaheads and conditionals and can't find a pattern that handles every case.
Sample data:
00019123211231731ORDER NO 761616 BR ADDRESS 123 A ST ORDER NO. 760641 JOHN DOE REF: ORDER #761625 OP212312165 ORDER NUMBER 759699 /REC/YR 123 A ST 766911 761223,761224,761225
Upvotes: 2
Views: 1905
Reputation: 15395
You can use a negative lookbehind and negative lookahead to make sure there are no digits adjacent to the match:
(?<!\d)\d{6}(?!\d)
This only matches the number, and not the adjacent characters.
Also, it works if the match is at the beginning or end of the string.
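As a quick check, here is a minimal Python sketch that runs this pattern over the sample data from the question. The variable names are my own, and Python's `re` module is just one of the engines that support these lookarounds.

```python
import re

# Sample data copied from the question.
sample = (
    "00019123211231731ORDER NO 761616 BR ADDRESS 123 A ST "
    "ORDER NO. 760641 JOHN DOE REF: ORDER #761625 OP212312165 "
    "ORDER NUMBER 759699 /REC/YR 123 A ST 766911 761223,761224,761225"
)

# Exactly six digits, with no digit immediately before or after.
pattern = re.compile(r"(?<!\d)\d{6}(?!\d)")

matches = pattern.findall(sample)
print(matches)
# ['761616', '760641', '761625', '759699', '766911', '761223', '761224', '761225']
```

The 17-digit run at the start and the 9-digit run after `OP` are skipped because every six-digit window inside them has a digit on at least one side.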
Upvotes: 3
Reputation: 44346
(^|\D)(\d{6})(\D|$)
You will find the 6-digit match in capturing group 2. Note that this pattern consumes the separator characters on both sides of the number, so it is reliable only when no two numbers share a single separator. It won't find both numbers in 123456,567890
(Thank you Alan for pointing this out!). If multiple matches are needed, a lookaround solution should be used.
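To make the limitation concrete, the following Python sketch compares the capturing-group pattern with a lookaround pattern on the string mentioned above. The lookaround form used here is the `(?<!\d)...(?!\d)` variant from the earlier answer, and the variable names are illustrative only.

```python
import re

text = "123456,567890"

# Consumes one character (or the string boundary) on each side of the number.
consuming = re.compile(r"(^|\D)(\d{6})(\D|$)")
# Only asserts what surrounds the number; consumes nothing but the digits.
lookaround = re.compile(r"(?<!\d)\d{6}(?!\d)")

consumed = [m.group(2) for m in consuming.finditer(text)]
asserted = lookaround.findall(text)

print(consumed)  # ['123456']
print(asserted)  # ['123456', '567890']
```

The first match of the consuming pattern swallows the comma, so the scan resumes at the `5` of the second number, where neither `^` nor `\D` can match.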
With look-arounds:
(?<=^|\D)\d{6}(?=\D|$)
or with look-arounds and the condition to be a valid number (i.e. the first digit is not 0):
(?<=^|\D)[1-9]\d{5}(?=\D|$)
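One caveat worth sketching: not every engine accepts a lookbehind whose alternatives have different widths (`^` is zero characters wide, `\D` is one). Python's `re` module, for instance, rejects `(?<=^|\D)`. The sketch below checks that and uses a negative-lookaround form that should behave the same way for this task; the helper `compiles` and the test string are my own assumptions, not part of the answer.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

original = r"(?<=^|\D)[1-9]\d{5}(?=\D|$)"
# "Not preceded by a digit" covers both start-of-string and a non-digit.
portable = r"(?<!\d)[1-9]\d{5}(?!\d)"

print(compiles(original))  # False: look-behind requires fixed-width pattern
print(compiles(portable))  # True

found = re.findall(portable, "012345 123456,654321 1234567")
print(found)  # ['123456', '654321']
```

Here `012345` is skipped because of its leading zero, and `1234567` is skipped because it is seven digits long.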
Upvotes: 5
Reputation: 3981
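Couldn't you just as easily use this regex?
[^0-9](\d{6})[^0-9]
It matches any 6-digit number that has a non-digit character on each side, so the number is not part of a longer digit sequence. The number itself is in capturing group 1. Note that it requires an actual character on both sides, so it will not match a number at the very start or end of the string.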
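A short Python sketch shows how this pattern behaves on the edge cases raised in the question. The helper `numbers` and the test strings are hypothetical, chosen only to illustrate the behaviour.

```python
import re

# Requires a real non-digit character on each side of the six digits.
padded = re.compile(r"[^0-9](\d{6})[^0-9]")

def numbers(text: str) -> list[str]:
    """Return the six-digit numbers captured in group 1."""
    return [m.group(1) for m in padded.finditer(text)]

# Numbers at the very start and end of the string are missed.
print(numbers("761616 abc 760641"))       # []

# Numbers separated by a single comma share that comma, so one is skipped.
print(numbers("x761223,761224,761225y"))  # ['761223', '761225']
```

Because `[^0-9]` consumes a character, it cannot match at a string boundary, and a separator used by one match is no longer available to the next.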
Upvotes: -1