James Hill

Reputation: 61792

RegEx behavior not as expected in C#

Background

I have just begun working with RegEx (last night).

I began matching a single [singleletter]: with the expression below:

Expression: \s+([a-z]:+)

Original string: foo u:james h c:user p:product

Output: ["foo ", "u:", "james h ", "c:", "user ", "p:", "product"]

Problem

I'm trying to modify the RegEx to capture [fullword]: instead of [singleletter]:. Both expressions below work as desired on regexr.com, but not in C#. What am I doing wrong?

Option 1: [a-zA-Z]*([a-zA-Z]:+)

Option 2: \w*([a-zA-Z]:+)

Test string: foo user:james h cust:user prod:product

Desired Output: ["foo", "user:", "james h ", "cust:", "user ", "prod:", "product"]

Fullword definition: case-insensitive a-z (plus the colon)

C# that doesn't work

// Verbatim strings (@"...") so that \w reaches the regex engine unescaped
var foo1 = Regex.Split("cust:test", @"[a-zA-Z]*([a-zA-Z]:+)");
var foo2 = Regex.Split("cust:test", @"\w*([a-zA-Z]:+)");

Lastly, the first expression, which currently works with [singleletter]:, returns an empty match at the beginning for every string tested, but only in C#. Again, I feel like I'm missing something...

Upvotes: 1

Views: 67

Answers (1)

revo

Reputation: 48711

Different regex engines behave differently. Try adding a word-boundary metacharacter (\b) and adjusting the pattern slightly:

\b([a-zA-Z]+:)
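A minimal sketch of how this pattern could be used with Regex.Split in C#. In .NET, Regex.Split puts only the text of capture groups into the result, so anything matched outside the parentheses (as in the question's `[a-zA-Z]*([a-zA-Z]:+)`) is discarded; moving the whole word inside the group avoids that. It also returns an empty string when a match sits at the very start or end of the input, which this sketch filters out. The class and method names (`SplitDemo`, `SplitOnKeys`) are hypothetical, introduced only for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Hypothetical helper: splits on "[word]:" tokens and keeps the tokens,
    // because the whole token is inside the capture group.
    public static string[] SplitOnKeys(string input)
    {
        return Regex.Split(input, @"\b([a-zA-Z]+:)")
                    // Drop the empty entry produced when a match is at the start or end.
                    .Where(part => part.Length > 0)
                    .ToArray();
    }

    public static void Main()
    {
        var parts = SplitOnKeys("foo user:james h cust:user prod:product");
        Console.WriteLine(string.Join("|", parts));
        // foo |user:|james h |cust:|user |prod:|product

        Console.WriteLine(string.Join("|", SplitOnKeys("cust:test")));
        // cust:|test
    }
}
```

Note that the separating whitespace stays attached to the preceding piece ("foo ", "james h "); calling Trim() on each part is one way to remove it if that matters.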

Upvotes: 1
