Reputation: 1030
I am trying to extract a six-digit zip code that starts with the digit 4 from a string. Right now I am using [4][0-9]{5}, but it also matches in the middle of longer numbers: for 020-25468811 it returns 468811. I don't want it to match inside a longer number, only complete six-digit numbers.
Upvotes: 2
Views: 1451
Reputation: 12807
Try to use the following:
(?<!\d)4\d{5}(?!\d)
That is, find a 6-digit number that starts with 4 and is neither preceded nor followed by a digit.
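As a minimal sketch of how the lookarounds behave, here is the same pattern in Python's re module. The question does not say which language is in use, so Python, the sample text, and the variable names are assumptions made for illustration only.

```python
import re

# Six digits starting with 4, not preceded and not followed by another digit.
ZIP_PATTERN = re.compile(r"(?<!\d)4\d{5}(?!\d)")

# Hypothetical sample text: a phone number, a valid zip, and a 7-digit number.
text = "Call 020-25468811 or write to 411001, ref 4110012"

zips = ZIP_PATTERN.findall(text)
print(zips)  # ['411001']
```

The 468811 inside the phone number is rejected because a digit precedes it, and 4110012 is rejected because a digit follows the first six digits.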
Upvotes: 2
Reputation: 45
Your expression currently matches a 4 followed by any five digits, anywhere in the string, including in the middle of a longer number. To fix this, add word boundaries, as per Jon's suggestion:
\b[4][0-9]{5}\b
More on word boundaries here: http://www.regular-expressions.info/wordboundaries.html
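One difference worth knowing: \b treats letters and underscores as word characters too, so a zip code glued to a letter is not matched, whereas a digit-only lookaround would match it. The following Python sketch compares the two. The sample strings and names are made up for illustration, and which behavior is preferable depends on the input data.

```python
import re

WORD_BOUNDARY = re.compile(r"\b4[0-9]{5}\b")
DIGIT_GUARD = re.compile(r"(?<![0-9])4[0-9]{5}(?![0-9])")

# Hypothetical inputs: a phone number, a zip glued to letters, a plain zip.
samples = ["020-25468811", "PIN411001", "zip: 411001."]

with_boundary = [WORD_BOUNDARY.findall(s) for s in samples]
with_guard = [DIGIT_GUARD.findall(s) for s in samples]

print(with_boundary)  # [[], [], ['411001']]
print(with_guard)     # [[], ['411001'], ['411001']]
```

Both patterns reject the phone number. They differ only on "PIN411001", where there is no word boundary between N and 4.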
Upvotes: 1
Reputation: 942
Regex has a start-of-line anchor: ^
You could do:
^4[0-9]{5}
If the numbers are not always at the beginning of a line, you can more generally use:
\<4[0-9]{5}\>
to match only whole words. Both examples work with egrep (\< and \> are GNU extensions and are not supported by every regex engine).
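To illustrate the limits of anchoring with ^ alone, here is a small Python sketch (Python does not support \< and \>, so only the ^ variant is shown). The sample lines are invented. Note that ^ misses numbers in the middle of a line and, without a trailing guard, still accepts the first six digits of a longer number.

```python
import re

# Hypothetical input: zip at line start, zip mid-line, 7-digit number at line start.
lines = "411001 Pune\nOffice: 411002\n4110012 is too long"

anchored = re.findall(r"^4[0-9]{5}", lines, flags=re.MULTILINE)
anchored_guarded = re.findall(r"^4[0-9]{5}(?![0-9])", lines, flags=re.MULTILINE)

print(anchored)          # ['411001', '411001']
print(anchored_guarded)  # ['411001']
```

The second 411001 in the first result is the prefix of 4110012. Neither variant finds 411002, because it does not start a line.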
Upvotes: -1
Reputation: 2182
You could simply add a space to the beginning of your regular expression: " 4[0-9]{5}". If you need a more general way of finding the beginning of the number (it could also be preceded by a tab, a newline, etc.), have a look at the predefined character class \s. Also have a look at boundary matchers. I don't know which language you are using, but regular expressions work very similarly in most languages. Check this Java regex documentation.
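A rough Python sketch of the whitespace approach follows. The sample text is made up. A literal leading space becomes part of the match and cannot match at the very start of the string, so this sketch assumes a capture group combined with "start of string or whitespace" as one way around that.

```python
import re

# Hypothetical input: zip at string start, zip after a space, and a phone number.
text = "411001 is home, office is 411002, phone 020-25468811"

leading_space = re.findall(r" 4[0-9]{5}", text)
start_or_space = re.findall(r"(?:^|\s)(4[0-9]{5})", text)

print(leading_space)   # [' 411002']
print(start_or_space)  # ['411001', '411002']
```

Neither pattern checks what follows the six digits, so a longer number preceded by whitespace would still match on its first six digits unless a trailing check is added.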
Upvotes: 0