Reputation: 21
I continue trying to perform string format matching using RegExp in VBScript & VB6. I am now trying to match a short, single-line string formatted as:
(1) Seven characters:
a. Six alphanumeric plus one "-" OR
b. Five alphanumeric plus two "-"
(2) Three numbers
(3) Two letters
(4) "65" followed by two hex characters
Examples include 123456-789LM65F2, 4EF789-012XY65A5, A2345--789AB65D0 & 23456--890JK65D0.
The RegExp pattern ([A-Z0-9\-]{12})([65][A-F0-9]{2}) lumps (1) - (3) together and finds these OK.
However, if I try to:
c) Break (3) out w/ pattern ([A-Z0-9\-]{10})([A-Z]{2})([65][A-F0-9]{2}),
d) Break out both (2) & (3) w/ pattern ([A-Z0-9\-]{7})([0-9]{3})([A-Z]{2})([65][A-F0-9]{2}), or
e) Tighten up (1) with alternation pattern ([A-Z0-9]{5}[-]{2}|[A-Z0-9]{6}[-]{1})([0-9]{3})([A-Z]{2})([65][A-F0-9]{2})
it refuses to find any of them.
What am I doing wrong? Following is a VBScript that runs and checks these.
' VB Script
Main()
Function Main() ' RegEx_Format_sample.vbs
    'Uses two patterns, TestPttn for full format accuracy check & SplitPttn
    'to separate the two desired pieces
    Dim reSet, EtchTemp, arrSplit, sTemp
    Dim sBoule, sSlice, idx, TestPttn, SplitPttn, arrMatch
    Dim arrPttn(3), arrItems(3), idxItem, idxPttn, Msgtemp
    Set reSet = New RegExp
    ' reSet.IgnoreCase = True ' Not using
    ' reSet.Global = True ' Not using
    ' load test case formats to check & split
    arrItems(0) = "0,6 nums + 1 '-',123456-789LM65F2"
    arrItems(1) = "1,6 chars + 1 '-',4EF789-012XY65A5"
    arrItems(2) = "2,5 chars + 2 '-',A2345--789AB65D0"
    arrItems(3) = "3,5 nums + 2 '-',23456--890JK65D0"
    SplitPttn = "([A-Z0-9]{5,6})[-]{1,2}([A-Z0-9]{9})" ' split pattern has never failed to work
    ' load the patterns to try
    arrPttn(0) = "([A-Z0-9\-]{12})([65][A-F0-9]{2})"
    arrPttn(1) = "([A-Z0-9\-]{10}[A-Z]{2})([65][A-F0-9]{2})"
    arrPttn(2) = "([A-Z0-9\-]{7})([0-9]{3})([A-Z]{2})([65][A-F0-9]{2})"
    arrPttn(3) = "([A-Z0-9]{5}[-]{2}|[A-Z0-9]{6}[-]{1})([0-9]{3})([A-Z]{2})([65][A-F0-9]{2})"
    For idxPttn = 0 To 3 ' select Test pattern
        TestPttn = arrPttn(idxPttn)
        TestPttn = TestPttn & "[%]" ' append % "ender" char
        SplitPttn = SplitPttn & "[%]" ' append % "ender" char
        For idxItem = 0 To 3
            reSet.Pattern = TestPttn ' set to Test pattern
            sTemp = arrItems(idxItem )
            arrSplit = Split(sTemp, ",") ' arrSplit is Split array
            EtchTemp = arrSplit(2) & "%" ' append % "ender" char to Item sub (2) as the "phrase" under test
            If reSet.Test(EtchTemp) = False Then
                MsgBox("RegEx " & TestPttn & " false for " & EtchTemp & " as " & arrSplit(1) )
            Else ' test OK; now switch to SplitPttn
                reSet.Pattern = SplitPttn
                Set arrMatch = reSet.Execute(EtchTemp) ' run Pttn as Exec this time
                If arrMatch.Count > 0 then ' If test OK then Count s/b > 0
                    Msgtemp = ""
                    Msgtemp = "RegEx " & TestPttn & " TRUE for " & EtchTemp & " as " & arrSplit(1)
                    For idx = 0 To arrMatch.Item(0).Submatches.Count - 1
                        Msgtemp = Msgtemp & Chr(13) & Chr(10) & "Split segment " & idx & " as " & arrMatch.Item(0).submatches.Item(idx)
                    Next
                    MsgBox(Msgtemp)
                End If ' Count OK
            End If ' test OK
        Next ' idxItem
    Next ' idxPttn
End Function
Upvotes: 2
Views: 175
Reputation: 21
All, thanks again for your help!!
trincot, everything in each arrItems() between the commas, including the "plus", is merely part of a shorthand description of each item's characteristics, such as "5 characters plus 2 dashes".
Gurman, your pattern breakdowns were helpful, but, if I read it right, the addition of the ? prefix is a "Match zero or one occurrences" and this must match exactly one occurrence. Also, my 1st pattern (matches 12) actually DID work for all my test cases.
jNevill & JMichelB, your suggestions are very close to what I ended up with.
I was "over-classing". After some tinkering, I was able to get the Test Pttn to successfully recognize these test cases by taking the 65 out of the [] in my original alternation pattern. That is, I went from [65] to (65) and Zammo! it worked.
Orig pattern:
([A-Z0-9]{5}[-]{2}|[A-Z0-9]{6}[-]{1})([0-9]{3})([A-Z]{2})([65][A-F0-9]{2})
Wkg pattern:
([A-Z0-9]{5}[-]{2}|[A-Z0-9]{6}[-]{1})([0-9]{3})([A-Z]{2})(65)([A-F0-9]{2})
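A minimal sketch of why that change matters, assuming it is run under cscript/wscript (WScript.Echo is not available in VB6). The shortened test string and the two cut-down patterns are made up for illustration; they keep only the tail of the real format.

```vbscript
Option Explicit
Dim re, s
Set re = New RegExp
s = "LM65F2%"   ' hypothetical shortened sample: two letters, 65, two hex chars, ender

' [65] is a character class: it consumes ONE character, either "6" or "5".
' After "LM" it takes "6", the hex pair takes "5F", and "%" is then compared with "2".
re.Pattern = "[A-Z]{2}[65][A-F0-9]{2}%"
WScript.Echo "[65] as a class : " & re.Test(s)

' 65 without brackets is the two-character literal "65".
re.Pattern = "[A-Z]{2}65[A-F0-9]{2}%"
WScript.Echo "65 as a literal : " & re.Test(s)
```

The first test fails and the second succeeds, which is the same behaviour as the Orig and Wkg patterns above.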
Oh, and I moved the
SplitPttn = SplitPttn & "[%]" ' append % "ender" char
statement up out of the For...Next loop, so it now runs only once. That helped with the splitting.
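A rough sketch of what was going wrong with the split, assuming cscript/wscript as the host. The loop below is a stripped-down stand-in for the outer For...Next loop in the question, not the original code; the variable names are made up.

```vbscript
Option Explicit
Dim re, pttn, i
Set re = New RegExp
pttn = "([A-Z0-9]{5,6})[-]{1,2}([A-Z0-9]{9})"

For i = 0 To 2
    pttn = pttn & "[%]"   ' appended again on every pass, as in the original outer loop
    re.Pattern = pttn
    ' The test string carries only one "%", so only the first pass can match.
    WScript.Echo i & ": " & pttn & " -> " & re.Test("123456-789LM65F2%")
Next
```

On the first pass the pattern ends in one [%] and matches; on later passes it ends in [%][%], then [%][%][%], and can no longer match a string with a single ender character.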
T-Bone
Upvotes: 0
Reputation: 10360
Try this Regex:
(?:[A-Z0-9]{6}-|[A-Z0-9]{5}--)[0-9]{3}[A-Z]{2}65[0-9A-F]{2}
Explanation:
(?:[A-Z0-9]{6}-|[A-Z0-9]{5}--)
- matches either 6 alphanumeric characters followed by a - or 5 alphanumeric characters followed by --
[0-9]{3}
- matches 3 digits
[A-Z]{2}
- matches 2 letters
65
- matches 65 literally
[0-9A-F]{2}
- matches 2 hex symbols
You can get some idea from the following code:
VBScript Code:
Option Explicit
Dim objReg, strTest
strTest = "123456-789LM65F2" 'Change the value as per your requirements. You can also store a list of values in an array and run the code in loop
set objReg = new RegExp
objReg.Global = True
objReg.IgnoreCase = True
objReg.Pattern = "(?:[A-Z0-9]{6}-|[A-Z0-9]{5}--)[0-9]{3}[A-Z]{2}65[0-9A-F]{2}"
if objReg.test(strTest) then
    msgbox strTest&" matches with the Pattern"
else
    msgbox strTest&" does not match with the Pattern"
end if
set objReg = Nothing
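One note on the (?: ... ) at the start of the pattern: it is a non-capturing group, not the ? quantifier ("zero or one"). It only groups the alternation and still has to match exactly once. A small sketch of the difference, assuming cscript/wscript as the host; the shortened patterns are for illustration only.

```vbscript
Option Explicit
Dim re, matches
Set re = New RegExp

' (?: ... ) groups the alternation but records no submatch for it.
re.Pattern = "(?:[A-Z0-9]{6}-|[A-Z0-9]{5}--)([0-9]{3})"
Set matches = re.Execute("A2345--789AB65D0")
WScript.Echo "non-capturing: " & matches.Item(0).SubMatches.Count & " submatch(es)"

' ( ... ) does the same grouping and also records what it matched.
re.Pattern = "([A-Z0-9]{6}-|[A-Z0-9]{5}--)([0-9]{3})"
Set matches = re.Execute("A2345--789AB65D0")
WScript.Echo "capturing: " & matches.Item(0).SubMatches.Count & " submatch(es)"
```

Both patterns match the same text; the first reports one submatch (the three digits) and the second reports two (the prefix and the three digits). If you need the prefix as a split segment, use the capturing form.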
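Your patterns do not work as intended because [65] is a character class: it matches a single character, either 6 or 5, not the two-character sequence 65. Here is what each one actually matches: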
([A-Z0-9\-]{12})([65][A-F0-9]{2})
- matches 12 occurrences of either an alphanumeric character or -, followed by either 6 or 5, followed by 2 hex characters
([A-Z0-9\-]{10}[A-Z]{2})([65][A-F0-9]{2})
- matches 10 occurrences of either an alphanumeric character or -, followed by 2 letters, followed by either 6 or 5, followed by 2 hex characters
([A-Z0-9\-]{7})([0-9]{3})([A-Z]{2})([65][A-F0-9]{2})
- matches 7 occurrences of either an alphanumeric character or -, followed by 3 digits, followed by 2 letters, followed by either 6 or 5, followed by 2 hex characters
([A-Z0-9]{5}[-]{2}|[A-Z0-9]{6}[-]{1})([0-9]{3})([A-Z]{2})([65][A-F0-9]{2})
- matches either 5 occurrences of an alphanumeric character followed by -- or 6 occurrences of an alphanumeric character followed by a -. This is then followed by 3 digits, followed by 2 letters, followed by either 6 or 5, followed by 2 hex characters
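As for why the first pattern appears to work on the test cases: it is not anchored, so the engine can start one character into the string and let the 12-character class absorb the "6". A hedged sketch that prints where the match starts, using the first sample with the "%" ender from the question's script and assuming cscript/wscript as the host:

```vbscript
Option Explicit
Dim re, m
Set re = New RegExp
re.Pattern = "([A-Z0-9\-]{12})([65][A-F0-9]{2})[%]"
Set m = re.Execute("123456-789LM65F2%")

If m.Count > 0 Then
    ' FirstIndex is zero-based; anything other than 0 means leading characters were skipped.
    WScript.Echo "FirstIndex: " & m.Item(0).FirstIndex
    WScript.Echo "Value:      " & m.Item(0).Value
    WScript.Echo "Group 1:    " & m.Item(0).SubMatches.Item(0)
    WScript.Echo "Group 2:    " & m.Item(0).SubMatches.Item(1)
End If
```

The match starts at index 1, so the leading "1" is not part of it, group 1 ends in the "6", and group 2 is "5F2". The other three patterns fix the position of the two letters, which leaves no room for this shift, so they fail outright.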
Upvotes: 1
Reputation: 475
Try this pattern:
(([A-Z0-9]{5}--)|([A-Z0-9]{6}-))[0-9]{3}[A-Z]{2}65[0-9A-F]{2}
Or, if the engine doesn't accept the [A-F] range in the last part, spell the letters out:
(([A-Z0-9]{5}--)|([A-Z0-9]{6}-))[0-9]{3}[A-Z]{2}65[0-9ABCDEF]{2}
Upvotes: 0