Rahul

Reputation: 11530

Replace regex until no match found

How can I run a regex replace repeatedly until nothing is left to replace?

For example, replacing "(\w{3} \d{1,9})\r?\n\w{2} (\d)" with "$1$2" four times gives the result below.

Text:

foo 1
ba 1
ba 2
ba 3
ba 4
foo 2
ba 1
ba 2
foo 3
ba 1
ba 2
ba 3

Result:

foo 11234
foo 212
foo 3123

Example code:

Dim regEx_, stxt
stxt = "foo 1" & VBcr & "ba 1" & VBcr & "ba 2" & VBcr & "ba 3" & VBcr _
  & "ba 4" & VBcr & "foo 2" & VBcr & "ba 1" & VBcr & "ba 2" & VBcr _
  & "foo 3" & VBcr & "ba 1" & VBcr & "ba 2" & VBcr & "ba 3"

Set regEx_ = New RegExp
With regEx_
  .Global = True
  .MultiLine = True
  .IgnoreCase = True
  .Pattern = "(\w{3} \d{1,9})[\r?\n]\w{2} (\d)"
  stxt = regEx_.Replace(stxt, "$1$2")
  stxt = regEx_.Replace(stxt, "$1$2")
  stxt = regEx_.Replace(stxt, "$1$2")
  stxt = regEx_.Replace(stxt, "$1$2")
  stxt = regEx_.Replace(stxt, "$1$2") ' just to be safe: the real data sometimes needs up to 30 replacements
End With
MsgBox stxt

Is there a way to keep replacing until no match is found? Something like this:

Do Until regEx_.Test(stxt)
  stxt = regEx_.Replace(stxt, "$1$2")
Loop

Upvotes: 1

Views: 425

Answers (2)

Ansgar Wiechers

Reputation: 200233

You don't need a loop if you modify your expression a little and use a replacement function with a second regular expression to remove all non-digits from the two-letter lines:

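' Replacement callback: m = whole match, sm1/sm2 = submatches,
' pos = offset of the match, src = source string
Function Merge(m, sm1, sm2, pos, src)
  Dim re
  Set re = New RegExp
  re.Global  = True
  re.Pattern = "\D"

  ' Strip every non-digit from the two-letter lines and append the rest to the first line
  Merge = sm1 & re.Replace(sm2, "")
End Function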

Set regEx_ = New RegExp
regEx_.Global  = True
regEx_.Pattern = "(\w{3} \d{1,9})((?:[\r?\n]\w{2} \d)+)"

stxt = regEx_.Replace(stxt, GetRef("Merge"))

((?:[\r?\n]\w{2} \d)+): The modification to your regular expression uses a non-capturing group, (?:...), to match one or more (+) consecutive two-letter lines. The outer pair of parentheses captures all of those lines as a single group, which is passed to the replacement function as the second submatch (sm2). Note that [\r?\n] is a character class: it matches exactly one CR, LF, or literal "?", which is enough for the vbCr-separated sample but will not match a CRLF pair.

The replacement function uses a second regular expression to remove all non-digit characters (\D) from the two-letter lines, leaving just the digits, which are then concatenated to the first submatch (sm1, (\w{3} \d{1,9})).

Basically, a string like this:

foo 1
ba 1
ba 2
ba 3
ba 4

gives two submatches, sm1:

foo 1

and sm2 (with a leading linebreak):


ba 1
ba 2
ba 3
ba 4

The replacement function then removes everything except the digits from sm2:

1234

and appends that to sm1:

Merge = "foo 1" & "1234"
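Put together with the sample text from the question, the approach can be tried as a standalone script. This is only a sketch: it assumes the lines are separated by vbCr, as in the question's code, and that it is run with cscript (WScript.Echo is used only for output).

```vbscript
Option Explicit

Dim regEx_, stxt

' Sample text from the question, joined with vbCr as in the original code
stxt = Join(Array("foo 1", "ba 1", "ba 2", "ba 3", "ba 4", _
                  "foo 2", "ba 1", "ba 2", _
                  "foo 3", "ba 1", "ba 2", "ba 3"), vbCr)

Set regEx_ = New RegExp
regEx_.Global  = True
regEx_.Pattern = "(\w{3} \d{1,9})((?:[\r?\n]\w{2} \d)+)"

' One pass is enough: each match already covers a whole foo/ba group
stxt = regEx_.Replace(stxt, GetRef("Merge"))

' Show the result one record per line
WScript.Echo Replace(stxt, vbCr, vbCrLf)

' Callback invoked once per match; sm2 holds all the two-letter lines
Function Merge(m, sm1, sm2, pos, src)
  Dim re
  Set re = New RegExp
  re.Global  = True
  re.Pattern = "\D"
  Merge = sm1 & re.Replace(sm2, "")
End Function
```

For the sample input this should print foo 11234, foo 212 and foo 3123 on separate lines, matching the expected result in the question.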

Upvotes: 2

Kul-Tigin

Reputation: 16950

You were close. The loop has to run while the pattern still matches, not until it matches:

Do While regEx_.Test(stxt)
    stxt = regEx_.Replace(stxt, "$1$2")
Loop
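
The difference from the attempt in the question is the loop condition: Do Until regEx_.Test(stxt) exits before the first pass, because Test is already True for the initial text. A self-contained sketch of the working loop, reusing the pattern and sample text from the question (the passes counter is added here only for illustration, and the script assumes it is run with cscript):

```vbscript
Option Explicit

Dim regEx_, stxt, passes

' Sample text from the question, joined with vbCr as in the original code
stxt = Join(Array("foo 1", "ba 1", "ba 2", "ba 3", "ba 4", _
                  "foo 2", "ba 1", "ba 2", _
                  "foo 3", "ba 1", "ba 2", "ba 3"), vbCr)

Set regEx_ = New RegExp
regEx_.Global  = True
regEx_.Pattern = "(\w{3} \d{1,9})[\r?\n]\w{2} (\d)"

passes = 0
' Test stays True while at least one pair of lines can still be merged
Do While regEx_.Test(stxt)
  stxt = regEx_.Replace(stxt, "$1$2")
  passes = passes + 1
Loop

WScript.Echo "passes: " & passes
WScript.Echo Replace(stxt, vbCr, vbCrLf)
```

With the sample text the loop runs four times, once per two-letter line in the longest group. It always terminates, since every successful replacement removes a line from the text.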

Upvotes: 2
