Reputation: 6909
I have been using this regex for a while and it has worked well for me, but on one particular string it fails and returns a strange result. Here is my code:
Dim m As Match = Regex.Match(line.Trim(), "3(?:\d{10,12}|[\d- _.]{10,16})", RegexOptions.IgnoreCase)
' If successful, write the group.
If (m.Success) Then
    strTemp = m.Groups(0).Value
End If
My string in question is:
line="SOS International LLC 246326 37-115-20618- - GB S AAA 3H"
My goal is to detect and extract 37-115-20618.
Normally the code above works in similar situations, but this particular string produces an unexpected result:
m.Groups(0).Value returns the following: "326 37"
Can anyone help me figure out what is wrong with my regex?
Upvotes: 1
Views: 91
Reputation: 21
A literal dash (-) in a character class is safest at the beginning or at the end of the class (or escaped as \-): [-\d _.] or [\d _.-]. Next, you can require a word boundary at the beginning and at the end of your regex, or allow a leading + or _ that does not follow a word character:
(?:\b|(?<!\w)[+_])3(?:\d{10,12}|[-\d _.]{10,16})\b
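To illustrate the point about hyphen placement, here is a minimal sketch. The module name `HyphenDemo` and the `[a-c]` class are made up for this example and do not come from the question; it only shows that a hyphen at the start or end of a class is read as a literal, while a hyphen between two characters is read as a range.

```vb
Imports System
Imports System.Text.RegularExpressions

Module HyphenDemo
    Sub Main()
        ' Hyphen first or last in the class: treated as a literal character.
        Console.WriteLine(Regex.IsMatch("-", "^[-\d _.]$"))   ' True
        Console.WriteLine(Regex.IsMatch("-", "^[\d _.-]$"))   ' True

        ' Hyphen between two characters: a range ("a" through "c"), not a literal.
        Console.WriteLine(Regex.IsMatch("-", "^[a-c]$"))      ' False
        Console.WriteLine(Regex.IsMatch("b", "^[a-c]$"))      ' True
    End Sub
End Module
```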
Upvotes: 2
Reputation: 174706
Remove the space inside the character class and add a word boundary at the start and at the end.
\b3(?:\d{10,12}|[-\d_.]{10,16})\b
Now grab the string you want from group index 0. \b matches between a word character (A-Z, a-z, 0-9, _) and a non-word character (any character that is not a word character).
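As a rough check, the word-boundary pattern can be run against the line from the question. This is only a sketch: `BoundaryDemo` and `ExtractNumber` are hypothetical names introduced for this example, and the input is the sample string exactly as posted.

```vb
Imports System
Imports System.Text.RegularExpressions

Module BoundaryDemo
    ' Hypothetical helper: returns the first match, or Nothing if there is none.
    Public Function ExtractNumber(line As String) As String
        Dim m As Match = Regex.Match(line.Trim(), "\b3(?:\d{10,12}|[-\d_.]{10,16})\b")
        Return If(m.Success, m.Value, Nothing)
    End Function

    Sub Main()
        Dim line As String = "SOS International LLC 246326 37-115-20618- - GB S AAA 3H"
        Console.WriteLine(ExtractNumber(line))   ' 37-115-20618
    End Sub
End Module
```

The 3 inside 246326 is skipped because it follows another digit, so there is no word boundary before it. The trailing hyphen after 20618 is dropped because the engine backtracks until the closing \b holds, which is right after the 8.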
Upvotes: 3