Kabulan0lak

Reputation: 2136

Regex syntax in a C# application

I am trying to figure out how to replace all punctuation in a string with a space while keeping one special character: '-'

For example, the sentence

"hi! I'm an out-of-the-box person, did you know ?"

should be transformed into

"hi I m an out-of-the-box person did you know "

I know the solution will be a single-line regex, but I'm really not used to "thinking" in regex, so what I have tried so far is replacing every '-' with '9', then replacing all punctuation with ' ', then replacing every '9' back with '-'. It works, but it is awful (especially if the input contains '9' characters):

string s = @"Hello! Hi want to remove all punctuations but not ' - ' signs ... Please help ;)";
s = s.Replace("-", "9");
s = Regex.Replace(s, @"[\W_]", " ");
s = s.Replace("9", "-");

So, can someone help me write a regex that only matches punctuation other than '-'?

Upvotes: 1

Views: 67

Answers (4)

ΩmegaMan

Reputation: 31616

Place every character you consider punctuation into a character class [ ... ] and match it as a single character, here wrapped in a group ( ... ), to be replaced. Here is an example that replaces !, ., ,, ', and ?.

string text = "hi! I'm an out-of-the-box person, did you know ?";

Console.WriteLine(Regex.Replace(text, "([!.,'?])", " "));

// result:
// hi  I m an out-of-the-box person  did you know

Update

For the regex purist who doesn't want to list every character, one can use character class subtraction. I still specify a class, this time \W, which matches any non-word character (anything other than a letter, digit, or underscore), including the -. Then the subtraction syntax -[ ... ] names the characters to exclude, here the -.

Here is that example

Regex.Replace(text, @"([\W-[-]])", " ")

// result:
// hi  I m an out-of-the-box person  did you know 

Upvotes: 0

Sriram Sakthivel

Reputation: 73452

This regex should help. Use character class subtraction to exclude some characters from a character class.

// Replace with a space, as the question asks, rather than an empty string.
var expected = Regex.Replace(subject, @"[_\W-[\-\s]]", " ");

Upvotes: 2

Bas

Reputation: 27095

How about replacing matches for the following regex with a space:

[^\w\s-]|_

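This matches any character that is not a word character (letter, digit, or underscore), whitespace, or a hyphen; the |_ alternative then matches the underscore as well, since \w would otherwise protect it.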
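A minimal sketch of how this pattern could be applied from C#, assuming the goal is one space per removed character. The class name PunctuationCleaner and the method name ReplacePunctuation are made up for illustration; only Regex.Replace comes from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PunctuationCleaner
{
    // Hypothetical helper: matches any character that is not a word character,
    // whitespace, or hyphen, plus the underscore (which \w would otherwise keep).
    private static readonly Regex Punctuation = new Regex(@"[^\w\s-]|_");

    // Replaces every matched character with a single space.
    public static string ReplacePunctuation(string input)
    {
        return Punctuation.Replace(input, " ");
    }
}
```

Calling PunctuationCleaner.ReplacePunctuation("hi! I'm an out-of-the-box person, did you know ?") keeps the hyphens and turns each punctuation mark into one space, so a mark that sits next to an existing space leaves two spaces in a row.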

Upvotes: 2

Selman Genç

Reputation: 101681

You can do this by using LINQ:

var chars = s.Select(c => char.IsPunctuation(c) && c != '-' ? ' ' : c);

var result = new string(chars.ToArray());

Upvotes: 1
