Reputation: 517
I am trying to write a regular expression in VB script to substitute some patterns.
My string may contain zero or more following patterns -
&USERID
&USERID
&USERID.
&USERID.
&USERID(n)
&USERID(n)
&USERID(n).
&USERID(n).
&USERID(n1, n2)
&USERID(n1, n2)
&USERID(n1, n2).
&USERID(n1, n2).
Sample string -
C:\temp\&USERID_&USERID._&USERID(4)_&USERID(2,2)..txt
If USERID=ABCDEF, then once substituted the resultant string should look like -
C:\temp\ABCDEF_ABCDEF_ABCD_BC.txt
The number in the bracket denotes the number of characters to substitute. Range can be specified using comma separated numbers. In order to achieve this I wrote a regular expression as follows -
"((&USERID\(\d+,\d+\)\.)|(&USERID\(\d+,\d+\)\.)|(&USERID\(\d+,\d+\))|(&USERID\(\d+,\d+\)))|((&USERID\(\d\)\.)|(&USERID\(\d\)\.)|(&USERID\(\d\))|(&USERID\(\d\))|(&USERID\.)|(&USERID\.))"
Using VBScript.RegExp I match the pattern and obtain collection of the matches. Iterating over each match object, I substitute either the complete USERID or part of it based on subscript.
The regular expression works fine. BUT it is very slow compared to string manipulation function.
Can above pattern be optimized?
Update:
I accepted the answer which solves one of my problem. Based on the regular expression, I tried to solve another find and replace problem as follows -
I have following patterns
DATE
DATE(MMDDYYYY)
DATE(DDMMYYYY)
DATE(YYYYMMDD)
DATE(YYYY)
DATE(MM)
DATE(DD)
DATE(DDMONYYYY)
DATE(MON)
DATE(MONTH)
DATE(YYDDD)
DATE(YYYYDDD)
It may have a terminating "." at the end.
Function replaceDate(matchString, label, position, sourceString)
If label = "MMDDYYYY" or label = "" then
replaceDate = "<MMDDYYYY>"
ElseIf label = "DDMMYYYY" then
replaceDate = "<DDMMYYYY>"
ElseIf label = "YYYYMMDD" then
replaceDate = "<YYYYMMDD>"
ElseIf label = "YYYY" then
replaceDate = "<YYYY>"
ElseIf label = "MM" then
replaceDate = "<MM>"
ElseIf label = "DD" then
replaceDate = "<DD>"
ElseIf label = "DDMONYYYY" then
replaceDate = "<DDMONYYYY>"
ElseIf label = "MON" then
replaceDate = "<MON>"
ElseIf label = "MONTH" then
replaceDate = "<MONTH>"
ElseIf label = "YYDD" then
replaceDate = "<YYYYDDD>"
Else
replaceDate = ""
end if
End Function
With new RegExp
.Global = True
.IgnoreCase = False
.Pattern = "(?:&(?:amp;)?)?DATE(?:\((MMDDYYYY|DDMMYYYY|YYYYMMDD|YYYY|MM|DD|DDMONYYYY|MON|MONTH|YYDDD|YYYYDDD)?\))?\.?"
strTempValue = .Replace(strTempValue, GetRef("replaceDate"))
End with
Upvotes: 0
Views: 420
Reputation: 70923
Without more data it is not easy to test, but you can try if this performs better
Dim USERID
USERID = "ABCDEF"
Dim originalString
originalString = "C:\temp\&USERID_&USERID._&USERID(4)_&USERID(2,2)..txt"
Dim convertedString
Function replaceUSERID(matchString, n1, n2, position, sourceString)
n1 = CLng("0" & Trim(n1))
n2 = CLng("0" & Trim(Replace(n2, ",", "")))
If n1 < 1 Then
replaceUSERID = USERID
ElseIf n2 > 0 Then
replaceUSERID = Mid(USERID, n1, n2)
Else
replaceUSERID = Left(USERID, n1)
End If
End Function
With New RegExp
.Pattern = "(?:&(?:amp;)?)?USERID(?:\((\s*\d+\s*)(,\s*\d+\s*)?\))?\.?"
.Global = True
.IgnoreCase = False
convertedString = .Replace(originalString, GetRef("replaceUSERID"))
End With
WScript.Echo originalString
WScript.Echo convertedString
For a multiple "label" replacement
Option Explicit
Dim dicLabels
Set dicLabels = WScript.CreateObject("Scripting.Dictionary")
With dicLabels
.Add "USERID", "ABCDEF"
.Add "LUSER", "ABCDEF"
.Add "ID", "GHIJKL"
End With
Dim originalString
originalString = "C:\temp\&USERID_&USERID._&USERID(4)_&USERID(2,2)_ID(2,3)_&LUSER..txt"
Dim convertedString
Function replaceLabels(matchString, label, n1, n2, position, sourceString)
If Not dicLabels.Exists(label) Then
replaceLabels = matchString
Else
n1 = CLng("0" & Trim(n1))
n2 = CLng("0" & Trim(Replace(n2,",","")))
replaceLabels = dicLabels.Item(label)
If n1 > 0 Then
If n2 > 0 Then
replaceLabels = Mid(dicLabels.Item(label), n1, n2)
Else
replaceLabels = Left(dicLabels.Item(label), n1)
End If
End If
End If
End Function
With New RegExp
.Pattern = "(?:&(?:amp;)?)?("& Join(dicLabels.Keys, "|") &")(?:\((\s*\d+\s*)(,\s*\d+\s*)?\))?\.?"
.Global = True
.IgnoreCase = False
convertedString = .Replace(originalString, GetRef("replaceLabels"))
End With
WScript.Echo originalString
WScript.Echo convertedString
Upvotes: 1