I have struggled with this expression for 2 days now so I thought I'd ask for some proper help from the world of knowledge. I hope someone can help.
This is the RegEx I built to get me what I want.
\S*\d*?-[A-Z]*[0-9]*
I only want the uppercase letters and numbers joined by dashes, so it does get GC-113, AO-1-GC-113, and AO-2-GC-113, which is great!
"I don't want this ------, but this is good GC-113, AO-1-GC-113, AO-2-GC-113"
But if I come across one where there is no space between the numbers, just another character like a comma or a period, then it returns a single match on the entire section "GC-113,AO-1-GC-113,AO-2-GC-113":
"I don't want this ------, but this is good GC-113,AO-1-GC-113,AO-2-GC-113"
I'm using RegExBuddy to try and figure this out.
This is the VBA code I'm using to get the matches.
Public Function GetRIs(ByVal vstrInString As String) As Collection
    ' Early binding: needs a reference to "Microsoft VBScript Regular Expressions 5.5".
    Dim myRegExp As RegExp
    Dim myMatches As Variant
    Dim myMatch As Variant
    Set GetRIs = New Collection
    Set myRegExp = New RegExp
    myRegExp.Global = True   ' return every match, not just the first one
    myRegExp.Pattern = "\S*\d*?-[A-Z]*[0-9]*"
    Set myMatches = myRegExp.Execute(vstrInString)
    For Each myMatch In myMatches
        If myMatch.Value <> "" Then
            GetRIs.Add myMatch.Value
        End If
    Next
End Function
Thanks! Dave
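Your \S*\d*?-[A-Z]*[0-9]* pattern can match even a single hyphen, because only - is obligatory and every other subpattern can match zero times (that is, it can be absent from the string). In addition, \S* matches any non-whitespace character, commas and periods included, which is why a comma-separated run is returned as one match.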
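To see the over-matching concretely, here is a minimal sketch that runs the original pattern against a short sample. The procedure name ShowOvermatch and the sample string are made up for illustration, and late binding via CreateObject is used only so the snippet runs without adding a library reference.

```vba
' Hypothetical demo procedure; not part of the original code.
Sub ShowOvermatch()
    Dim re As Object, m As Object

    ' Late binding, so no reference to the RegExp library is required.
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.Pattern = "\S*\d*?-[A-Z]*[0-9]*"   ' the pattern from the question

    ' Sample input modeled on the question's text (an assumption).
    For Each m In re.Execute("------ GC-113,AO-1-GC-113,AO-2-GC-113")
        Debug.Print "[" & m.Value & "]"
    Next m
    ' Immediate window output:
    ' [------]
    ' [GC-113,AO-1-GC-113,AO-2-GC-113]
End Sub
```

The first match shows that a run of bare hyphens satisfies the pattern, and the second shows \S* swallowing the commas between the codes.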
You can use
myRegExp.Pattern = "\b[A-Z0-9]+(?:-[A-Z0-9]+)+"
The pattern matches:
\b - a word boundary (before the next letter or digit there must be a non-word character or the start of the string)
[A-Z0-9]+ - one or more uppercase letters or digits
(?:-[A-Z0-9]+)+ - one or more sequences of:
    - - a hyphen
    [A-Z0-9]+ - one or more uppercase letters or digits
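Putting the suggested pattern into the question's function might look like the sketch below. The names GetRIsFixed and DemoGetRIsFixed are hypothetical, late binding is an assumption made so the code runs without a library reference, and the sample string is the comma-separated one from the question.

```vba
' Hypothetical variant of GetRIs using the suggested pattern.
Public Function GetRIsFixed(ByVal vstrInString As String) As Collection
    Dim re As Object, m As Object

    Set GetRIsFixed = New Collection
    Set re = CreateObject("VBScript.RegExp")   ' late binding
    re.Global = True                           ' collect every match
    re.IgnoreCase = False                      ' keep [A-Z] limited to uppercase
    re.Pattern = "\b[A-Z0-9]+(?:-[A-Z0-9]+)+"

    ' Every match has at least one hyphen-joined pair, so no empty-value check is needed.
    For Each m In re.Execute(vstrInString)
        GetRIsFixed.Add m.Value
    Next m
End Function

' Hypothetical demo; prints one code per line to the Immediate window.
Sub DemoGetRIsFixed()
    Dim v As Variant
    For Each v In GetRIsFixed("I don't want this ------, but this is good GC-113,AO-1-GC-113,AO-2-GC-113")
        Debug.Print v
    Next v
    ' Output:
    ' GC-113
    ' AO-1-GC-113
    ' AO-2-GC-113
End Sub
```

Because each repetition of the group requires a hyphen followed by at least one letter or digit, the run of bare hyphens and the lone capital "I" are both skipped, and a comma ends each match.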