Mac

Reputation: 517

Regular expression to substitute a pattern in VBScript

I am trying to write a regular expression in VBScript to substitute some patterns.

My string may contain zero or more of the following patterns:

&USERID
&amp;USERID
&USERID.
&amp;USERID.
&USERID(n)
&amp;USERID(n)
&USERID(n).
&amp;USERID(n).
&USERID(n1, n2)
&amp;USERID(n1, n2)
&USERID(n1, n2).
&amp;USERID(n1, n2).

Sample string -

C:\temp\&USERID_&USERID._&USERID(4)_&USERID(2,2)..txt

If USERID=ABCDEF, then once substituted the resultant string should look like -

C:\temp\ABCDEF_ABCDEF_ABCD_BC.txt

A single number in the parentheses gives the number of leading characters to substitute. Two comma-separated numbers select a part of the value: in the example above, (2,2) yields BC, that is, two characters starting at position 2. To achieve this I wrote a regular expression as follows:

"((&USERID\(\d+,\d+\)\.)|(&USERID\(\d+,\d+\)\.)|(&USERID\(\d+,\d+\))|(&USERID\(\d+,\d+\)))|((&USERID\(\d\)\.)|(&USERID\(\d\)\.)|(&USERID\(\d\))|(&USERID\(\d\))|(&USERID\.)|(&USERID\.))"

Using VBScript.RegExp, I match the pattern and obtain the collection of matches. Iterating over each match object, I substitute either the complete USERID or part of it, depending on the subscript.
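A minimal sketch of that match-and-iterate approach, for illustration only. It uses a simplified stand-in pattern rather than the one above, assumes the two-number form means start position and length, and the helper name ExpandUserId is hypothetical:

```vbscript
Option Explicit

' Hypothetical helper: pick the part of userId selected by n1 and n2.
Function ExpandUserId(userId, n1, n2)
    If n1 < 1 Then
        ExpandUserId = userId                 ' no subscript: whole value
    ElseIf n2 > 0 Then
        ExpandUserId = Mid(userId, n1, n2)    ' assumed: n2 characters from position n1
    Else
        ExpandUserId = Left(userId, n1)       ' first n1 characters
    End If
End Function

Dim userId, source, result, re, matches, m, i, n1, n2
userId = "ABCDEF"
source = "C:\temp\&USERID_&USERID._&USERID(4)_&USERID(2,2)..txt"
result = source

Set re = New RegExp
re.Global = True
re.IgnoreCase = False
' Simplified stand-in pattern (an assumption, not the pattern from the question).
re.Pattern = "&USERID(?:\((\d+)(?:,(\d+))?\))?\.?"

Set matches = re.Execute(source)

' Go from the last match to the first so earlier FirstIndex values stay valid.
For i = matches.Count - 1 To 0 Step -1
    Set m = matches(i)
    ' Prefixing "0" turns a missing subscript into 0.
    n1 = CLng("0" & m.SubMatches(0))
    n2 = CLng("0" & m.SubMatches(1))
    ' FirstIndex is zero-based; Mid is one-based.
    result = Left(result, m.FirstIndex) & ExpandUserId(userId, n1, n2) & _
             Mid(result, m.FirstIndex + m.Length + 1)
Next

WScript.Echo result
```

Run with cscript, this should print the sample string with every &USERID token expanded. Walking the matches backwards avoids recomputing offsets after each substitution.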

The regular expression works fine, but it is very slow compared to string manipulation functions.

Can the above pattern be optimized?

Update:

I accepted the answer, which solves one of my problems. Based on its regular expression, I tried to solve another find-and-replace problem as follows.

I have the following patterns:

DATE
DATE(MMDDYYYY)
DATE(DDMMYYYY)
DATE(YYYYMMDD)
DATE(YYYY)
DATE(MM)
DATE(DD)
DATE(DDMONYYYY)
DATE(MON)
DATE(MONTH)
DATE(YYDDD)
DATE(YYYYDDD)

Each pattern may be followed by a terminating ".".

' Replacement callback: label is the captured format (empty when DATE has no parentheses).
Function replaceDate(matchString, label, position, sourceString)
    If label = "MMDDYYYY" Or label = "" Then
        replaceDate = "<MMDDYYYY>"
    ElseIf label = "DDMMYYYY" Then
        replaceDate = "<DDMMYYYY>"
    ElseIf label = "YYYYMMDD" Then
        replaceDate = "<YYYYMMDD>"
    ElseIf label = "YYYY" Then
        replaceDate = "<YYYY>"
    ElseIf label = "MM" Then
        replaceDate = "<MM>"
    ElseIf label = "DD" Then
        replaceDate = "<DD>"
    ElseIf label = "DDMONYYYY" Then
        replaceDate = "<DDMONYYYY>"
    ElseIf label = "MON" Then
        replaceDate = "<MON>"
    ElseIf label = "MONTH" Then
        replaceDate = "<MONTH>"
    ElseIf label = "YYDDD" Then
        replaceDate = "<YYDDD>"
    ElseIf label = "YYYYDDD" Then
        replaceDate = "<YYYYDDD>"
    Else
        replaceDate = ""
    End If
End Function


With new RegExp
    .Global = True 
    .IgnoreCase = False
    .Pattern =  "(?:&(?:amp;)?)?DATE(?:\((MMDDYYYY|DDMMYYYY|YYYYMMDD|YYYY|MM|DD|DDMONYYYY|MON|MONTH|YYDDD|YYYYDDD)?\))?\.?"
    strTempValue = .Replace(strTempValue, GetRef("replaceDate"))
End with

Upvotes: 0

Views: 420

Answers (1)

MC ND

Reputation: 70923

Without more data it is not easy to test, but you can check whether this performs better:

Dim USERID
    USERID = "ABCDEF"

Dim originalString    
    originalString = "C:\temp\&USERID_&amp;USERID._&USERID(4)_&USERID(2,2)..txt"

Dim convertedString

    Function replaceUSERID(matchString, n1, n2, position, sourceString)
        n1 = CLng("0" & Trim(n1))
        n2 = CLng("0" & Trim(Replace(n2, ",", "")))
        If n1 < 1 Then 
            replaceUSERID = USERID
        ElseIf n2 > 0 Then 
            replaceUSERID = Mid(USERID, n1, n2)
        Else 
            replaceUSERID = Left(USERID, n1)
        End If 
    End Function 

    With New RegExp
        .Pattern = "(?:&(?:amp;)?)?USERID(?:\((\s*\d+\s*)(,\s*\d+\s*)?\))?\.?"
        .Global = True 
        .IgnoreCase = False
        convertedString = .Replace(originalString, GetRef("replaceUSERID"))
    End With 

    WScript.Echo originalString
    WScript.Echo convertedString
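
The function passed through GetRef appears to be called once per match with the matched text first, then one argument per capture group, then the match position and the source string, which is why replaceUSERID takes five parameters for a two-group pattern. A minimal sketch to observe this, using a hypothetical one-group pattern and the made-up name showArgs:

```vbscript
Option Explicit

' Hypothetical callback for a pattern with exactly one capture group.
Function showArgs(matchString, group1, position, sourceString)
    WScript.Echo "match=" & matchString & " group1=" & group1 & " position=" & position
    ' The return value replaces the matched text.
    showArgs = UCase(matchString)
End Function

Dim converted

With New RegExp
    .Pattern = "a(\d)"
    .Global = True
    converted = .Replace("a1 b2 a3", GetRef("showArgs"))
End With

WScript.Echo converted
```

If the number of parameters does not match the number of capture groups plus three, the call is likely to fail at run time, so the function signature has to be kept in step with the pattern.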

To replace several labels at once:

Option Explicit

Dim dicLabels
    Set dicLabels = WScript.CreateObject("Scripting.Dictionary")

    With dicLabels
        .Add "USERID", "ABCDEF"
        .Add "LUSER", "ABCDEF"
        .Add "ID", "GHIJKL"
    End With

Dim originalString    
    originalString = "C:\temp\&USERID_&amp;USERID._&USERID(4)_&USERID(2,2)_ID(2,3)_&amp;LUSER..txt"

Dim convertedString

    Function replaceLabels(matchString, label, n1, n2, position, sourceString)
        If Not dicLabels.Exists(label) Then 
            replaceLabels = matchString
        Else 
            n1 = CLng("0" & Trim(n1))
            n2 = CLng("0" & Trim(Replace(n2,",","")))
            replaceLabels = dicLabels.Item(label)
            If n1 > 0 Then 
                If n2 > 0 Then 
                    replaceLabels = Mid(dicLabels.Item(label), n1, n2)
                Else 
                    replaceLabels = Left(dicLabels.Item(label), n1)
                End If 
            End If
        End If
    End Function 

    With New RegExp
        .Pattern = "(?:&(?:amp;)?)?("& Join(dicLabels.Keys, "|") &")(?:\((\s*\d+\s*)(,\s*\d+\s*)?\))?\.?"
        .Global = True 
        .IgnoreCase = False
        convertedString = .Replace(originalString, GetRef("replaceLabels"))
    End With 

    WScript.Echo originalString
    WScript.Echo convertedString

Upvotes: 1
