Reputation: 2072
I need to split the following string
the quick brown fox jumps over the lazy dog
into the following tokens:
So to explain, I want to split on "the" but include the "the" delimiter in the preceding array element (not as its own, separate element).
Can anyone shed any light on this or perhaps give me the correct regex?
I am using C#.
Upvotes: 3
Views: 2350
Reputation: 55589
You need to use a look-behind, (?<=...). The name says it all: it looks at the preceding characters to see whether they match a given pattern.
This should work:
"(?<=\\bthe) "
So, at any space, check if the previous characters were "the"; if so, it matches.
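As a minimal sketch of how that pattern could be used (assuming C# 9 or later for top-level statements, and using the variable names input and tokens purely for illustration), Regex.Split can be called with the look-behind directly. Because the look-behind consumes no characters, only the space is removed and "the" stays at the end of the preceding element:

```csharp
using System;
using System.Text.RegularExpressions;

string input = "the quick brown fox jumps over the lazy dog";

// Split at each space that is immediately preceded by the whole word "the".
// The look-behind is zero-width, so "the" remains in the preceding token
// and only the matched space is dropped.
string[] tokens = Regex.Split(input, "(?<=\\bthe) ");

foreach (string token in tokens)
{
    Console.WriteLine("[" + token + "]");
}
// Prints:
// [the]
// [quick brown fox jumps over the]
// [lazy dog]
```

Note that the space itself is discarded here; if it should be kept with the preceding element too, the pattern would need adjusting.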
Note - We also need to include the word boundary \\b (the escaped form of \b), otherwise something like "bathe" will also match.
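To see the effect of the word boundary, here is a small comparison on a made-up input containing "bathe" (the sample string and the names loose and strict are assumptions for illustration only, and top-level statements assume C# 9 or later):

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input chosen because "bathe" ends in "the".
string sample = "bathe the dog";

// Without \b, the space after "bathe" also counts as a split point.
string[] loose = Regex.Split(sample, "(?<=the) ");

// With \b, only the space after the standalone word "the" counts.
string[] strict = Regex.Split(sample, "(?<=\\bthe) ");

Console.WriteLine(string.Join("|", loose));   // bathe|the|dog
Console.WriteLine(string.Join("|", strict));  // bathe the|dog
```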
Without the look-behind, we'll check all the spaces:
   v     v     v   v     v    v   v    v
the quick brown fox jumps over the lazy dog
With the look-behind, we'll only match those that have "the" before them (ignoring the \\b for now):
"the " - just found a space, and last characters are "the", so match.
"quick " - just found another space, but last characters are "...k", so no match.
etc.
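The walk-through above can be checked by listing where each pattern matches. This is only a sketch (the names text, allSpaces and afterThe are illustrative, and top-level statements assume C# 9 or later); it prints the zero-based index of every matched space, first for a bare space and then for the look-behind version:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string text = "the quick brown fox jumps over the lazy dog";

// A bare space matches at every space in the string.
int[] allSpaces = Regex.Matches(text, " ")
    .Cast<Match>()
    .Select(m => m.Index)
    .ToArray();

// With the look-behind, only spaces preceded by the word "the" match.
int[] afterThe = Regex.Matches(text, "(?<=\\bthe) ")
    .Cast<Match>()
    .Select(m => m.Index)
    .ToArray();

Console.WriteLine(string.Join(",", allSpaces)); // 3,9,15,19,25,30,34,39
Console.WriteLine(string.Join(",", afterThe));  // 3,34
```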
Test.
Upvotes: 4