B Hart

Reputation: 1118

RegEx Find all occurrences and instances of multiple words if they exist in a string

This is a continuation of RegEx Only Return matches if words are present between two words

I'm trying to use RegEx with vbScript or VBA and find all occurrences of specific words in a string. This string comes from a large config file and contains other data but I can parse out the blocks of string that I need with another RegEx command. In my test routine below, it finds the first occurrence of each OR'd word and stops. I'm trying to return all occurrences and instances of each word found if they exist within the string. I just can't seem to figure out how to make it loop or continue checking...

I've also made a RegEx Tester Link for the below: http://regex101.com/r/zP8aT3

Sub TestRegEx_1()
Dim TestString, X
Dim objRegEx, f_objResults, f_Match

TestString = "edit GoodMatch1 ;mode" & vbCrLf & _
    "Something Random" & vbCrLf & _
    "KeyWord_2 A B and C and also D E" & vbCrLf & _
    "Something Random" & vbCrLf & _
    "Something Random" & vbCrLf & _
    "KeyWord_3 1 A and 2 B" & vbCrLf & _
    "Something Random" & vbCrLf & _
    "KeyWord_1 1 2 and 3 and also 4 5" & vbCrLf & _
    "Something Random" & vbCrLf & _
    "exit"

Set objRegEx = CreateObject("VBScript.RegExp")
objRegEx.IgnoreCase = True
objRegEx.MultiLine = True
objRegEx.Global = True

objRegEx.Pattern = "^edit\s(.*?)\s\;mode[\S\s]*?(?=.*?\b(KeyWord_1|KeyWord_2|KeyWord_3|NonExistant_1)\b)(?=.*?\b(1|2|3|A|B|C|8|9|10|X|Y|Z)\b)[\S\s]*?exit$"
Set f_objResults = objRegEx.Execute(TestString)
For Each f_Match In f_objResults
    'MsgBox f_Match
    For Each X In f_Match.submatches
        MsgBox X
    Next
Next

End Sub

What I'm trying to achieve would be something like the below:

'Expected f_Match.SubMatches Output in a loop
'GoodMatch1
'KeyWord_2
'A
'B
'C
'KeyWord_3
'1
'A
'2
'B
'KeyWord_1
'1
'2
'3

Or something similar and workable... Please let me know if any additional information is needed. Any help is greatly appreciated. Thank You!

Upvotes: 1

Views: 3745

Answers (1)

Jerry

Reputation: 71538

Well, if this time you don't mind not capturing the whole block, you can use a modification of the first two regexes I wrote for your previous question:

(?:(?:edit (\S+))|(KeyWord_1|KeyWord_2|KeyWord_3)|\b([0-9A-Z])\b)(?=(?:(?!edit[^;]+;mode )[\s\S])*?exit)

regex101 demo

(?:(?:edit (\S+))|(KeyWord_1|KeyWord_2|KeyWord_3)|\b([0-9A-Z])\b)

This is one big group with three alternatives, only one of which takes part in any given match:

(?:edit (\S+)) to get the edit name.

(KeyWord_1|KeyWord_2|KeyWord_3) to get the keywords

\b([0-9A-Z])\b to get the letters/numbers
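
To see how those three alternatives show up in SubMatches, here is a minimal sketch that should run as VBScript or VBA. The sample string is a shortened, hypothetical version of the test string from the question, and the Name/Keyword/Value labels are only for illustration. It assumes that a group which did not take part in a match comes back with length 0, which is why it tests Len() rather than comparing against a specific value.

```vb
Sub ListTokens()
    Dim sample, re, m, report

    ' Hypothetical sample block, shortened from the question's test string.
    sample = "edit GoodMatch1 ;mode" & vbCrLf & _
             "KeyWord_2 A B and C" & vbCrLf & _
             "exit"

    Set re = CreateObject("VBScript.RegExp")
    re.IgnoreCase = True
    re.Global = True   ' return every match, not just the first one
    re.Pattern = "(?:(?:edit (\S+))|(KeyWord_1|KeyWord_2|KeyWord_3)|\b([0-9A-Z])\b)" & _
                 "(?=(?:(?!edit[^;]+;mode )[\s\S])*?exit)"

    For Each m In re.Execute(sample)
        ' Each match fills exactly one of the three capture groups.
        If Len(m.SubMatches(0)) > 0 Then
            report = report & "Name:    " & m.SubMatches(0) & vbCrLf
        ElseIf Len(m.SubMatches(1)) > 0 Then
            report = report & "Keyword: " & m.SubMatches(1) & vbCrLf
        Else
            report = report & "Value:   " & m.SubMatches(2) & vbCrLf
        End If
    Next

    MsgBox report
End Sub
```

Because Global is True, Execute returns one Match per token, so the loop runs once for the edit name, once per keyword and once per single letter or digit. Note that with IgnoreCase on, \b([0-9A-Z])\b picks up any standalone single character, not only the ones listed in the question's pattern.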

And I think you can use the 'whole block' regex from before to first extract the block, and then this one to get the individual keywords. Unfortunately, that 'block regex' cannot do the individual captures itself: overlapping matches are not allowed, and since it already matches the whole block, you would need one capture group for each part you want to get, which isn't very practical... unless there's a method to do it which I don't know yet. =P
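
A rough sketch of that two-step idea, in case it helps. The block pattern here is a simplified stand-in that I am assuming for illustration, not the regex from the previous question, and the two-block config string is made up. The token pattern is the one from this answer, unchanged.

```vb
Sub TokensPerBlock()
    Dim config, blockRe, tokenRe, blockMatch, tokenMatch, i, report

    ' Hypothetical two-block sample standing in for the real config file.
    config = "edit First ;mode" & vbCrLf & "KeyWord_1 1 2" & vbCrLf & "exit" & vbCrLf & _
             "edit Second ;mode" & vbCrLf & "KeyWord_3 A" & vbCrLf & "exit"

    ' Step 1: pull out each edit ... exit block (simplified, assumed pattern).
    Set blockRe = CreateObject("VBScript.RegExp")
    blockRe.IgnoreCase = True
    blockRe.Global = True
    blockRe.Pattern = "edit\s+\S+\s*;mode[\s\S]*?\bexit\b"

    ' Step 2: the token regex from this answer, run on one block at a time.
    Set tokenRe = CreateObject("VBScript.RegExp")
    tokenRe.IgnoreCase = True
    tokenRe.Global = True
    tokenRe.Pattern = "(?:(?:edit (\S+))|(KeyWord_1|KeyWord_2|KeyWord_3)|\b([0-9A-Z])\b)" & _
                      "(?=(?:(?!edit[^;]+;mode )[\s\S])*?exit)"

    For Each blockMatch In blockRe.Execute(config)
        For Each tokenMatch In tokenRe.Execute(blockMatch.Value)
            ' Keep whichever capture group actually matched.
            For i = 0 To tokenMatch.SubMatches.Count - 1
                If Len(tokenMatch.SubMatches(i)) > 0 Then
                    report = report & tokenMatch.SubMatches(i) & vbCrLf
                End If
            Next
        Next
        report = report & "----" & vbCrLf   ' separator between blocks
    Next

    MsgBox report
End Sub
```

Running the token regex per block keeps the results grouped by block, which a single pass over the whole file would not give you directly.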

Upvotes: 1
