Reputation: 8502
I need to create a regex that will match this expression:
replace:sub\:str:new\:Substr
I have to be careful about not matching other similar-looking strings, though. For example, this is a different match:
slice:fromIndex[:toIndex]
Specifically:
- The string must begin with replace: and, if it does not, then nothing should match.
- It should match escaped colons (\:) but not unescaped colons (:).
- It should capture sub\:str and new\:Substr.
The format is replace:<subString>:<replacementString>. However, both the subString and the replacementString can have escaped colons (\:), which is why the example includes them.
I've been unable to come up with a solution. While I'm not an expert at regex, I'm normally pretty competent. So far I've only been able to either ignore replace: and simply match on
(?<=\:)(?:\\:|[^:])+
which includes both substrings but also matches other patterns, or change the lookbehind to
(?<=replace:)
which only matches the first substring. I just can't figure out how to get it to also match the second substring without including the : separator. I suspect I need to nest the expression somehow, but I've been completely unsuccessful at it.
Note: I can solve this in the language. I can simply check if the string has the prefix replace:
as a separate check. But I'd really like to do the match completely in Regex if it's possible.
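The language-side workaround mentioned in the note could look roughly like the sketch below. Python is only an assumption here (the question does not name a language), and extract_parts is a hypothetical helper name. It checks the prefix separately, reuses the lookbehind pattern from the question, and additionally assumes that exactly two parts are required.

```python
import re

# Pattern from the question: runs of escaped colons or non-colon characters
# that directly follow a colon.
PART = re.compile(r'(?<=:)(?:\\:|[^:])+')

def extract_parts(text):
    """Return [subString, replacementString], or [] if text is not a replace: filter."""
    # Separate prefix check: the step the regex alone does not enforce.
    if not text.startswith('replace:'):
        return []
    parts = PART.findall(text)
    # Assumption: exactly two parts are required, so 'replace:UserName' yields nothing.
    return parts if len(parts) == 2 else []

print(extract_parts(r'replace:sub\:str:new\:Substr'))
print(extract_parts('slice:908:1098'))
```

Without the prefix check, the same pattern would also pick up the two parts of slice:908:1098, which is the over-matching described above.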
replace:sub\:str:new\:Substr
matches: sub\:str, new\:Substr
replace:subString:replacement
matches: subString, replacement
replace:UserId:user\:ID
matches: UserId, user:ID
replace:UserName:Aaron Hayman
matches: UserName, Aaron Hayman
replace:userId:uid90809y087
matches: userId, uid90809y087
rep:userId:user
matches: none
replace:UserName
matches: none
slice:908:1098
matches: none
This should give you an example. As background, after this string is parsed, it would be applied as a kind of filter for another template string.
Upvotes: 0
Views: 3090
Reputation: 91428
How about:
^replace:(\w+\\:\w+):(\w+\\:\w+)
The first group will contain sub\:str
and the second new\:Substr
New version according to OP's edit:
^replace:([^:]+(?:\\:)?[^:]+):([^:]+(?:\\:)?[^:]+)
It works for all given test cases.
If you don't want replace: in the whole match, put it in a lookbehind:
(?<=^replace:)([^:]+(?:\\:)?[^:]+):([^:]+(?:\\:)?[^:]+)
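As a quick check, the second pattern can be run against some of the question's examples. This is a minimal sketch in Python (an assumption, since the thread does not name a language); parse is a hypothetical helper name.

```python
import re

# Second pattern from this answer, copied verbatim.
PATTERN = re.compile(r'^replace:([^:]+(?:\\:)?[^:]+):([^:]+(?:\\:)?[^:]+)')

def parse(text):
    """Return (subString, replacementString), or None when there is no match."""
    m = PATTERN.match(text)
    return m.groups() if m else None

# Inputs taken from the question's examples.
for case in [r'replace:sub\:str:new\:Substr', r'replace:UserId:user\:ID',
             'replace:UserName', 'slice:908:1098']:
    print(case, '->', parse(case))
```

Each group contains two [^:]+ runs, so every part needs at least two characters; for instance, replace:a:b does not match. Also, [^:] accepts a backslash, so a part with more than one escaped colon may be split at an escaped colon. The question's examples do not exercise either case.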
Upvotes: 0
Reputation: 626932
The regex that will match all escape sequences you may have in a C string literal will look like
replace:([^:\\]*(?:\\.[^:\\]*)*):([^:\\]*(?:\\.[^:\\]*)*)
See the regex demo
NOTE: If it must appear at the start of the string, add ^ at the pattern start.
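A minimal sketch of this pattern in use, with ^ added as the note suggests. Python is an assumption (the thread does not name a language), and parse and unescape are hypothetical helper names; unescape is included only because the question lists user:ID as the expected result for user\:ID.

```python
import re

# Pattern from this answer with ^ added at the start.
PATTERN = re.compile(r'^replace:([^:\\]*(?:\\.[^:\\]*)*):([^:\\]*(?:\\.[^:\\]*)*)')

def parse(text):
    """Return (subString, replacementString), or None when there is no match."""
    m = PATTERN.match(text)
    return m.groups() if m else None

def unescape(part):
    """Hypothetical helper: drop the backslash in front of each escaped character."""
    return re.sub(r'\\(.)', r'\1', part)

raw = parse(r'replace:UserId:user\:ID')
print(raw)                                # captured parts, still escaped
print(tuple(unescape(p) for p in raw))    # parts with \: turned into :
```

The pattern has no end anchor, so with replace:a:b:c the match stops before the third colon and the groups are a and b. Appending $ would reject such input instead, if that is the desired behavior.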
Details:
replace: - a literal char sequence
([^:\\]*(?:\\.[^:\\]*)*) - Capturing group 1 matching:
  [^:\\]* - 0+ chars other than : and \
  (?:\\.[^:\\]*)* - zero or more sequences of:
    \\. - any escaped char (a \ and any char)
    [^:\\]* - 0+ chars other than : and \
: - an unescaped :
([^:\\]*(?:\\.[^:\\]*)*) - Capturing group 2, see above.
Upvotes: 1
Reputation: 15784
Quite convoluted, but you can nest lookarounds:
replace:(.+?(?!(?<=\\):)):(.+(?!(?<=\\):))
It ensures that, after replace:, the : that ends a part is not itself preceded by a \.
Drawback:
In the case of 3 parts (a third unescaped :), the second part will include everything after the first separator; see the demo for what I mean.
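The behavior, including the drawback, can be reproduced with a short sketch. Python is an assumption (the thread does not name a language) and parse is a hypothetical helper name; Python's re module accepts this nesting because the inner lookbehind has a fixed width.

```python
import re

# Pattern from this answer, copied verbatim.
PATTERN = re.compile(r'replace:(.+?(?!(?<=\\):)):(.+(?!(?<=\\):))')

def parse(text):
    """Return (subString, replacementString), or None when there is no match."""
    m = PATTERN.match(text)
    return m.groups() if m else None

print(parse(r'replace:sub\:str:new\:Substr'))
# Drawback: a third unescaped colon ends up inside the second group.
print(parse('replace:a:b:c'))
```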
Upvotes: 0