Reputation: 43
I am using RegEx in VBA and trying to replace any character that is not a vertical pipe within a beginning and ending vertical pipe.
In other words, my text looks like this:
++++|+++This|Is|An|Example++|++++
I only want to replace the non-pipe characters that appear after the first pipe and before the last pipe.
So, using an "a" as the replacement character for example:
++++|aaaaaaa|aa|aa|aaaaaaaaa|++++
I've tried a negative lookahead (the VBA implementation of RegEx doesn't support lookbehind), and while it works for excluding the characters after the last pipe from replacement, those before the first pipe still get replaced:
(?!^ *\|)[^\|](?! *$)
aaa|aaaaaaa|aa|aa|aaaaaaaaa|++++
I'm obviously not understanding the negative lookahead correctly because I can't seem to get those characters that appear prior to the first pipe to be excluded from the match.
Can someone please help me with this?
Upvotes: 4
Views: 166
Reputation: 4233
Here is how I solved the problem in python. I look behind for | and look ahead for | as a group. I create a capture group looking for + (0 or more) and word character (1 or more) and space (0 or more) then repeat the capture group using ?:
data="++++|+++This is bb|Is|An|Example++|++++"
replace_with="a"
matches=re.findall(r'(?<=\|)(?:\+*\w+\+*\s*)+(?=\|)', data)
for match in matches:
print(match)
data=data.replace(match,replace_with*len(match))
print(data)
output:
++++|aaaaaaaaaaaaa|aa|aa|aaaaaaaaa|++++
Upvotes: 0
Reputation: 626738
The (?!^ *\|)[^\|](?! *$)
regex matches
(?!^ *\|)
- a location that is not immediately followed with zero or more spaces and then a |
char[^\|]
- any char other than a |
char that(?! *$)
- immediately to the right, there should be no optional spaces and then the end of string.I suggest using another approach: match and capture what you need to keep and just match what you need to replace.
See the VBA code:
Dim pattern As regExp, m As Object
Dim text As String, result As String, repl As String, offset As Long
text = "++++|+++This|Is|An|Example++|++++"
repl = "a"
offset = 0
Set pattern = New regExp
With pattern
.pattern = "(^\++\||\|\++$)|[^|]"
.Global = True
End With
result = text
For Each m In pattern.Execute(text)
If Len(m.SubMatches(0)) = 0 Then ' If Group 1 matched, replace with "a"
result = Left(result, m.FirstIndex + offset) & repl & Mid(result, m.FirstIndex + m.Length + 1 + offset)
offset = offset + Len(repl) - m.Length
End If
Next
Output: ++++|aaaaaaa|aa|aa|aaaaaaaaa|++++
.
The (^\++\||\|\++$)|[^|]
regex matches
(^\++\||\|\++$)
- Group 1: one or more +
chars and then a |
at the start of string, or a |
and then one or more +
chars till the end of string|
- or[^|]
- any char other than a |
char.Upvotes: 0
Reputation: 4457
This is kind of a workaround, but instead of a complicated Regex pattern, which probably won't work in VBA, you can use a quick loop though the string and replace each section using a much simpler Regex pattern.
Sub Example()
Const InputString As String = "++++|+++This|Is|An|Example++|++++"
Debug.Print DoTheThing(InputString)
'Output: ++++|aaaaaaa|aa|aa|aaaaaaaaa|++++
End Sub
Function DoTheThing(InputString As String) As String
Dim Pieces() As String
Pieces = Split(InputString, "|")
Dim Regex As Object
Set Regex = CreateObject("VBScript.RegExp")
Regex.Global = True
Regex.Pattern = "."
If UBound(Pieces) > 1 Then
Dim i As Long
For i = 1 To UBound(Pieces) - 1
Pieces(i) = Regex.Replace(Pieces(i), "a")
Next
End If
DoTheThing = Join(Pieces, "|")
End Function
Or you can even avoid the use of the Regex Object entirely:
Sub Example()
Const InputString As String = "++++|+++This|Is|An|Example++|++++"
Debug.Print DoTheThing(InputString)
'Output: ++++|aaaaaaa|aa|aa|aaaaaaaaa|++++
End Sub
Function DoTheThing(InputString As String) As String
Dim Pieces() As String
Pieces = Split(InputString, "|")
If UBound(Pieces) > 1 Then
Dim i As Long
For i = 1 To UBound(Pieces) - 1
Pieces(i) = String(Len(Pieces(i)), "a")
Next
End If
DoTheThing = Join(Pieces, "|")
End Function
Upvotes: 1
Reputation: 18490
If data follows the pattern (always at least two vertical pipes) here an idea with optional first part:
(^[^|]*\|)?[^|](?=[^|]*\|)
See this demo at regex101 (used \n
in demo for not skipping lines) - replace with $1a
The part until the first pipe is captured to the first group $1
and inserted in replacement.
The lookahead checks after each [^|]
(negated class) if there is another |
pipe ahead.
Upvotes: 3