ChandlerPelhams

Reputation: 1678

Regex split for words containing periods

Can someone tell me how to modify this regex to allow periods in a string?

string[] parts = Regex.Split(s, @"\b|[^\.#_a-zA-Z0-9()=><!%]");

If I provide the string: "HELLO ABC.123"

This regex is returning {"HELLO", "ABC", ".", "123"}

I want to return {"HELLO", "ABC.123"}

Please forgive my noobishness for regex patterns.

EDIT: I am using C# 3.5

Upvotes: 4

Views: 810

Answers (2)

tofcoder

Reputation: 2382

Just remove \b from \b|[^\.#_a-zA-Z0-9()=><!%], and use:

string[] parts = Regex.Split(s, @"[^.#_a-zA-Z0-9()=><!%]");
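As a quick check of this approach, here is a minimal sketch that runs the question's sample string through a negated character class with the period kept inside it (if the period is left out of the class, it becomes a delimiter and ABC.123 gets split). The class name SimpleSplitDemo and the helper SplitOnDisallowed are made up for illustration. A period needs no backslash inside a character class.

```csharp
using System;
using System.Text.RegularExpressions;

static class SimpleSplitDemo
{
    // Hypothetical helper: splits on any single character that is NOT in the allowed set.
    // The period is part of the allowed set, so "ABC.123" stays in one piece.
    public static string[] SplitOnDisallowed(string s)
    {
        return Regex.Split(s, @"[^.#_a-zA-Z0-9()=><!%]");
    }

    static void Main()
    {
        string[] parts = SplitOnDisallowed("HELLO ABC.123");
        Console.WriteLine(string.Join("|", parts)); // prints HELLO|ABC.123
    }
}
```

Because the class matches one character at a time, two delimiters in a row (for example a double space) produce an empty entry between them. Appending + to the class, or filtering out empty entries afterwards, avoids that.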

Upvotes: 1

Qtax

Reputation: 33928

\b matches on both sides of the period in ABC.123. You can change it to avoid that. For example:

(?<![\w.])(?=[\w.])|(?<=[\w.])(?![\w.])

Giving the complete quoted expression:

@"(?<![\w.])(?=[\w.])|(?<=[\w.])(?![\w.])|[^\w.#()=><!%]+"

You may want to add the #()=><!% characters to all the character classes.
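A minimal sketch of how the complete expression might be used, assuming .NET 3.5 with LINQ available. The names BoundarySplitDemo and SplitTokens are hypothetical. The lookaround alternatives match zero-width positions, so Regex.Split may return empty or whitespace-only entries (at the start and end of the input, and for the space between two boundaries). The sketch filters those out, which is an addition not shown in the answer above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class BoundarySplitDemo
{
    // The complete expression from the answer: custom "word boundaries" that treat
    // the period as a word character, plus a run of any other disallowed characters.
    const string Pattern =
        @"(?<![\w.])(?=[\w.])|(?<=[\w.])(?![\w.])|[^\w.#()=><!%]+";

    // Hypothetical helper: splits, then drops empty and whitespace-only pieces.
    // Trim().Length is used instead of string.IsNullOrWhiteSpace, which needs .NET 4.
    public static string[] SplitTokens(string s)
    {
        return Regex.Split(s, Pattern)
                    .Where(p => p.Trim().Length > 0)
                    .ToArray();
    }

    static void Main()
    {
        string[] parts = SplitTokens("HELLO ABC.123");
        Console.WriteLine(string.Join("|", parts)); // prints HELLO|ABC.123
    }
}
```

No boundary can match inside ABC.123, because every character there, including the period, has a [\w.] character on both sides.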

Upvotes: 3
