Soham Dasgupta

Reputation: 5199

C# Regex Match Vs Split for same string

I was playing around in LINQPad with a regex to extract parts of a string, and I have a question about the results. Can anyone shed some light on this?

string s = "abc|xyz";
Regex.Match(s, @"(\w*)[|]{1}(\w*)").Dump();
Regex.Split(s, @"(\w*)[|]{1}(\w*)").Dump();

With Regex.Match I get back two groups which I can easily extract.

[Image: Regex.Match output]

But I don't understand why the Regex.Split result contains two empty entries.

[Image: Regex.Split output]

Upvotes: 2

Views: 1439

Answers (1)

Lucas Trzesniewski

Reputation: 51330

Let's analyze your string:

abc|xyz
\_____/  <-- the match
\_/      <-- capture group 1
    \_/  <-- capture group 2

Regex.Split includes the captured groups in the resulting array.

The splits happen at the whole match, right there:

abc|xyz
\      \

So there's an empty string before the match, and an empty string after the match. The two items in the middle are inserted because of the aforementioned split behavior:

If capturing parentheses are used in a Regex.Split expression, any captured text is included in the resulting string array. For example, if you split the string "plum-pear" on a hyphen placed within capturing parentheses, the returned array includes a string element that contains the hyphen.
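To see this behavior concretely, here is a minimal sketch as a small console program rather than LINQPad's Dump(). The class name SplitDemo and the comma-joined output are illustrative choices, not part of the question. The second call, which splits on the bare delimiter, is an assumption about what was probably intended when only the two words are wanted.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    public static void Main()
    {
        string s = "abc|xyz";

        // Pattern from the question: the whole string is a single match, so the
        // pieces before and after it are empty; the two groups land in between.
        string[] withGroups = Regex.Split(s, @"(\w*)[|]{1}(\w*)");
        Console.WriteLine(string.Join(",", withGroups)); // prints ",abc,xyz,"

        // Splitting on the delimiter alone, with no capturing groups,
        // leaves only the text on either side of it.
        string[] plain = Regex.Split(s, @"[|]");
        Console.WriteLine(string.Join(",", plain)); // prints "abc,xyz"

        // The documentation's example: a captured delimiter is kept in the array.
        string[] fruit = Regex.Split("plum-pear", "(-)");
        Console.WriteLine(string.Join(",", fruit)); // prints "plum,-,pear"
    }
}
```

For a single-character delimiter like this one, s.Split('|') gives the same two elements without involving a regex at all.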

Upvotes: 2
