Reputation: 5199
I was playing around in LINQPad with a regex to extract a string, and I have a question about the results. Can anyone shed some light on this?
string s = "abc|xyz";
Regex.Match(s, @"(\w*)[|]{1}(\w*)").Dump();
Regex.Split(s, @"(\w*)[|]{1}(\w*)").Dump();
With Regex.Match I get back two groups, which I can easily extract.
But I don't understand why the result of Regex.Split contains two empty entries.
Upvotes: 2
Views: 1439
Reputation: 51330
Let's analyze your string:
abc|xyz
\_____/   <-- the match
\_/       <-- capture group 1
    \_/   <-- capture group 2
Regex.Split includes the captured groups in the resulting array.
The split happens around the whole match, at the two positions marked here:
abc|xyz
\      \
So there's an empty string before the match, and an empty string after the match. The two items in the middle are inserted because of the aforementioned split behavior:
If capturing parentheses are used in a Regex.Split expression, any captured text is included in the resulting string array. For example, if you split the string "plum-pear" on a hyphen placed within capturing parentheses, the returned array includes a string element that contains the hyphen.
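To make the behavior concrete, here is a minimal sketch as a standalone console program rather than a LINQPad snippet. It uses Console.WriteLine in place of Dump(), and the class name SplitDemo and the helper Show are made up for illustration. It runs the original pattern, the "plum-pear" case from the quoted documentation, and a delimiter-only pattern, which is one possible way to get just the two parts.

```csharp
using System;
using System.Text.RegularExpressions;

class SplitDemo // hypothetical name, for illustration only
{
    // Hypothetical helper: prints each element in quotes so empty strings are visible.
    static void Show(string label, string[] parts)
    {
        string[] quoted = Array.ConvertAll(parts, p => "\"" + p + "\"");
        Console.WriteLine(label + ": " + string.Join(", ", quoted));
    }

    static void Main()
    {
        string s = "abc|xyz";

        // The pattern matches the entire input, so the text before and after
        // the match is empty; the two captured groups are inserted in between.
        Show("original pattern", Regex.Split(s, @"(\w*)[|]{1}(\w*)"));
        // original pattern: "", "abc", "xyz", ""

        // The documentation's example: the captured hyphen becomes an element.
        Show("captured delimiter", Regex.Split("plum-pear", "(-)"));
        // captured delimiter: "plum", "-", "pear"

        // Matching only the delimiter, with no capture groups, yields just the parts.
        Show("delimiter only", Regex.Split(s, @"[|]"));
        // delimiter only: "abc", "xyz"
    }
}
```

If the goal is only to separate the two halves, splitting on the delimiter alone (or calling s.Split('|')) avoids the empty entries; keeping Regex.Match is the better fit when the groups themselves are what you want.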
Upvotes: 2