Reputation: 1034
I have the following input text:
A B C D E \F G H I JKL \M
and I would like to match every character that does not have a \ as a prefix, each character individually. So basically, as matches, I'd like to get A, B, C, D, E, G, H, I, J, K and L, with F and M not matching because they are prefixed/escaped.
I got as far as
([^\\]([A-Z]{1}))
which works but not exactly as expected:
- A is ignored, because there is nothing before it (and I am testing for anything but the backslash)
- each letter is matched together with the space before it
- JKL is matched as J with a space before it, and KL as one string.
I have tried several other variations with parentheses, but without success.
Upvotes: 2
Views: 35
Reputation: 627101
The negated character class [^\\] is a consuming pattern: it matches a character, adds it to the match value, and advances the regex index to the end of the match.
Use a non-consuming negative lookbehind:
(?<!\\)[A-Z]
^^^^^^^
See the regex demo. Being a non-consuming pattern, the (?<!\\) only checks whether there is a backslash immediately before an ASCII uppercase letter; if there is one, the engine fails the match at that position. If there is no \, the letter is matched, and nothing before it is added to the match value.
C# code:
// Requires: using System.Linq; using System.Text.RegularExpressions;
// s is the input string
var results = Regex.Matches(s, @"(?<!\\)[A-Z]")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToList();
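To see the difference between the consuming and the non-consuming approach, here is a small self-contained sketch that runs both patterns on the input from the question. The class name `Demo` and the `|` separator are only illustrative choices, not part of the original code.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

class Demo
{
    static void Main()
    {
        // Sample input from the question
        string s = @"A B C D E \F G H I JKL \M";

        // Pattern from the question: [^\\] consumes the character before each letter
        var consuming = Regex.Matches(s, @"([^\\]([A-Z]{1}))")
            .Cast<Match>()
            .Select(m => m.Value);

        // Negative lookbehind: checks the previous character without consuming it
        var lookbehind = Regex.Matches(s, @"(?<!\\)[A-Z]")
            .Cast<Match>()
            .Select(m => m.Value);

        Console.WriteLine(string.Join("|", consuming));
        // prints " B| C| D| E| G| H| I| J|KL"

        Console.WriteLine(string.Join("|", lookbehind));
        // prints "A|B|C|D|E|G|H|I|J|K|L"
    }
}
```

The first line of output reproduces the symptoms described in the question (A is missing, each letter carries its leading space, and KL comes out as one string). The second line contains only the unescaped letters.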
Upvotes: 2