Reputation: 2136
I am trying to figure out how to replace all punctuation in a string with a space while keeping one special character: '-'
For example, the sentence
"hi! I'm an out-of-the-box person, did you know ?"
should be transformed into
"hi I m an out-of-the-box person did you know "
I know the solution will be a single-line regex, but I'm really not used to "thinking" in regex. What I have tried so far is replacing every '-' with '9', then replacing all punctuation with ' ', then replacing every '9' back with '-'. It works, but it is awful (especially if the input already contains some '9' characters):
string s = @"Hello! Hi want to remove all punctuations but not ' - ' signs ... Please help ;)";
s = s.Replace("-", "9");
s = Regex.Replace(s, @"[\W_]", " ");
s = s.Replace("9", "-");
So, can someone help me write a regex that only catches punctuation other than '-'?
Upvotes: 1
Views: 67
Reputation: 31616
Place everything you consider punctuation into a set [ ... ] and look for that as a single match character in a ( ... ) to be replaced. Here is an example where I seek to replace '!', '.', ',', "'" and '?'.
string text = "hi! I'm an out-of-the-box person, did you know ?";
Console.WriteLine(Regex.Replace(text, "([!.,'?])", " "));
// result (every matched character becomes one space, so a mark next to an existing space leaves two):
// "hi  I m an out-of-the-box person  did you know  "
Update
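For the regex purist who doesn't want to list every character, one can use character class subtraction. I still specify a set, but it starts from \W, which matches any non-word character, including the -. The subtraction syntax -[ ... ] then lets us name the - as the character to exclude. Note that \W also matches whitespace (harmless here, since a space is replaced by a space) and does not match the underscore, which counts as a word character.
Here is that example: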
Regex.Replace(text, @"([\W-[-]])", " ")
// result:
// "hi  I m an out-of-the-box person  did you know  "
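As a rough, self-contained sketch of the subtraction approach above: the class and method names here are made up for illustration, and the second input is only there to show that \W leaves underscores alone.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SubtractionDemo
{
    // Hypothetical helper: start from \W (any non-word character)
    // and subtract the hyphen so it survives the replacement.
    public static string ReplacePunctuation(string input)
    {
        return Regex.Replace(input, @"[\W-[-]]", " ");
    }

    public static void Main()
    {
        string text = "hi! I'm an out-of-the-box person, did you know ?";
        // Quotes are added so that trailing spaces are visible.
        Console.WriteLine("\"" + ReplacePunctuation(text) + "\"");

        // The underscore is a word character, so \W does not touch it.
        Console.WriteLine("\"" + ReplacePunctuation("a-b, c_d!") + "\"");
    }
}
```

If underscores should be replaced as well, one option is to add `_` to the base set before the subtraction.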
Upvotes: 0
Reputation: 73452
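This regex should help. It uses character class subtraction to remove specific characters from a character class: here the hyphen and whitespace are subtracted from the set made of the underscore plus all non-word characters.
// Replace with " " rather than "", since the question asks for a space.
var expected = Regex.Replace(subject, @"[_\W-[\-\s]]", " ");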
Upvotes: 2
Reputation: 27095
How about replacing matches for the following regex with a space:
[^\w\s-]|_
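This matches any character that is not a word character (letters, digits and underscore), whitespace, or a hyphen. The |_ alternative then adds the underscore, which \w would otherwise protect.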
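A minimal sketch of how that pattern could be wired into Regex.Replace, assuming a hypothetical helper name and a made-up input string:

```csharp
using System;
using System.Text.RegularExpressions;

public static class NegatedClassDemo
{
    // Hypothetical helper: [^\w\s-] matches anything that is not a word
    // character, whitespace, or a hyphen; |_ additionally matches the underscore.
    public static string ReplacePunctuation(string input)
    {
        return Regex.Replace(input, @"[^\w\s-]|_", " ");
    }

    public static void Main()
    {
        // Quotes are added so that trailing spaces are visible.
        Console.WriteLine("\"" + ReplacePunctuation("snake_case, kebab-case!") + "\"");
    }
}
```

Because existing whitespace is not matched, it is left untouched rather than being replaced by an identical space.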
Upvotes: 2
Reputation: 101681
You can do this with LINQ:
var chars = s.Select(c => char.IsPunctuation(c) && c != '-' ? ' ' : c);
var result = new string(chars.ToArray());
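For completeness, here is the same idea as a small runnable sketch (the class and method names are hypothetical). One thing worth knowing is that char.IsPunctuation follows the Unicode punctuation categories, so symbols such as '+' or '=' are not treated as punctuation and are kept, unlike with the \W-based patterns:

```csharp
using System;
using System.Linq;

public static class LinqDemo
{
    // Hypothetical helper: swap every punctuation character except '-' for a space.
    public static string ReplacePunctuation(string input)
    {
        var chars = input.Select(c => char.IsPunctuation(c) && c != '-' ? ' ' : c);
        return new string(chars.ToArray());
    }

    public static void Main()
    {
        // '+' and '=' are math symbols, not punctuation, so they survive.
        Console.WriteLine("\"" + ReplacePunctuation("a+b = c, d-e!") + "\"");
    }
}
```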
Upvotes: 1