michi

Reputation: 6625

How to capture several portions of a string at once with regex?

I need to capture several strings within a longer string strText and process them. I use VBA.

strText:

Salta pax {wenn([gender]|1|orum|2|argentum)} {[firstname]} {[lastname]},  
ginhox seperatum de gloria desde quativo, 
dolus {[start]} tofi {[end]}, ({[n_night]}   
{wenn([n_night]|1|dignus|*|digni)}), cum {[n_person]} 
{wenn([n_person]|1|felix|*|semporum)}.
Quod similis beruntur: {[number]}

I'm trying to capture different portions of strText, all within the curly braces:

  1. If there's only a string within square brackets, I'd like to capture the string:

{[firstname]} --> firstname

  2. If there's a conditional operation (starting with wenn()), I'd like to capture the string within the square brackets plus the number-value pairs that follow it:

{wenn([gender]|1|orum|2|argentum)} --> gender / 1=orum / 2=argentum

I managed to define a pattern that handles either one of the cases above on its own,

e.g. \{\[(.+?)\]\} capturing the strings within square brackets,
see this regex101

but I figure there must be a way to have a pattern that does all of the above?

Upvotes: 1

Views: 52

Answers (1)

Ralf S

Reputation: 190

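I'm not sure if the following code is helpful to you. It uses alternation (the | symbol) so that a single pattern matches both cases: the first alternative matches the plain {[name]} form, the second matches the wenn() form.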

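' Requires a reference to "Microsoft VBScript Regular Expressions 5.5"
' (Tools > References in the VBA editor).
Function extractStrings(strText As String) As MatchCollection

    Dim regEx As New RegExp

    With regEx
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        ' Alternative 1: {[name]}
        ' Alternative 2: wenn([name]|key1|value1|key2|value2)}
        .Pattern = "(\{\[)(.+?)(\]\})|(wenn\(\[)(.+?)(\])(\|)(.+?)(\|)(.+?)(\|)(.+?)(\|)(.+?)(\)\})"
    End With

    ' Return Nothing if Execute raises an error
    On Error Resume Next
    Set extractStrings = regEx.Execute(strText)
    If Err.Number <> 0 Then Set extractStrings = Nothing
    On Error GoTo 0
End Function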

Sub test()

    Dim strText As String
    strText = "Salta pax {wenn([gender]|1|orum|2|argentum)} {[firstname]} {[lastname]},ginhox seperatum de gloria desde quativo,dolus {[start]} tofi {[end]}, ({[n_night]} " & _
    "{wenn([n_night]|1|dignus|*|digni)}), cum {[n_person]}{wenn([n_person]|1|felix|*|semporum)}.Quod similis beruntur: {[number]}"

    Dim SubStrings As MatchCollection
    Dim SubString As Match

    Set SubStrings = extractStrings(strText)
    If SubStrings Is Nothing Then Exit Sub

    For Each SubString In SubStrings
        ' SubMatches is zero-based: index 1 is the name in {[name]};
        ' indices 4, 7, 9, 11 and 13 are the name and the two pairs in wenn(...)
        If SubString.SubMatches(1) <> "" Then
            Debug.Print SubString.SubMatches(1)
        Else
            Debug.Print "wenn(" & SubString.SubMatches(4) & "|" & SubString.SubMatches(7) & "=" & SubString.SubMatches(9) & "|" & SubString.SubMatches(11) & "=" & SubString.SubMatches(13) & ")"
        End If
    Next SubString

End Sub

You can iterate over all matches with the For Each loop. I am well aware that the regex pattern is not optimal (it only handles wenn() expressions with exactly two number-value pairs), but at least it does the trick.
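
If the number of number-value pairs can vary, one possible variation is to capture the whole pair list in a single group and split it afterwards. The following is only a minimal sketch under a few assumptions: it needs the same reference to "Microsoft VBScript Regular Expressions 5.5", the names DescribeMatch and DemoPlaceholders are made up for illustration, and it assumes that names never contain "]" and that keys and values never contain "|" or ")".

```vba
' Sketch only: DescribeMatch and DemoPlaceholders are hypothetical names.
' Requires a reference to "Microsoft VBScript Regular Expressions 5.5".

' Turns one match into "name" or "name / key=value / key=value ...".
Function DescribeMatch(m As Match) As String
    Dim parts() As String
    Dim result As String
    Dim i As Long

    ' SubMatches(2) is only filled by the plain {[name]} alternative.
    If Len(m.SubMatches(2) & "") > 0 Then
        DescribeMatch = m.SubMatches(2)
        Exit Function
    End If

    result = m.SubMatches(0)
    ' SubMatches(1) looks like "|1|orum|2|argentum": drop the leading "|"
    ' and split into key, value, key, value, ...
    parts = Split(Mid$(m.SubMatches(1), 2), "|")
    For i = 0 To UBound(parts) - 1 Step 2
        result = result & " / " & parts(i) & "=" & parts(i + 1)
    Next i

    DescribeMatch = result
End Function

Sub DemoPlaceholders()
    Dim re As New RegExp
    Dim m As Match

    re.Global = True
    ' Group 0: wenn name, group 1: all "|key|value" pairs, group 2: plain name
    re.Pattern = "\{wenn\(\[([^\]]+)\]((?:\|[^|)]+\|[^|)]+)+)\)\}|\{\[([^\]]+)\]\}"

    For Each m In re.Execute("{wenn([gender]|1|orum|2|argentum)} {[firstname]} {wenn([n]|1|a|2|b|*|c)}")
        Debug.Print DescribeMatch(m)
    Next m
End Sub
```

The intent is that each placeholder is printed in the "gender / 1=orum / 2=argentum" form asked for in the question, regardless of how many pairs a wenn() expression carries. Splitting the pair list in VBA rather than in the pattern is a deliberate choice here, since a repeated capture group only keeps its last repetition.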

Upvotes: 1
