Reputation: 855
I have been trying to build a regular expression but haven't been able to get one specific condition to work.
I want a regex to remove all non-alphanumeric characters, with the exception of the dash (-). Dashes should only be replaced if they are preceded by a space.
For example,
TEST-TEST -TEST#TEST.TEST
should be changed to
TEST-TEST TEST TEST TEST
I had been using [^a-zA-Z0-9] but haven't been able to include an OR condition in it.
Upvotes: 2
Views: 5024
Reputation: 6343
using System.Text.RegularExpressions;

// Skip over '-', grab non-word characters or the ' -' sequence to replace
string pattern = @"(?!-)(\W| -)+";
string replacement = "";
Regex regex = new Regex(pattern);
// result is "Replacethisstring-already"
string result = regex.Replace("Replace - this *@#&@#* string-already", replacement);
The (?!-) is a zero-width negative lookahead assertion that keeps a match from starting at a '-', so a dash between word characters is left alone. The group after it still matches a dash when it's preceded by a space.
If you're trying to substitute a space instead of completely removing the characters, just change the replacement to
string replacement = " ";
The pattern is greedy, so it replaces each run of non-word characters with a single space.
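To check this pattern against the sample from the question, here is a minimal sketch. The names LookaheadDemo and Clean are made up for illustration; only the pattern comes from the answer. It also shows one edge case worth knowing about: \W matches '-' too, so the lookahead only protects a dash at the start of a match, and a dash that directly follows another non-word character is consumed with the rest of the run.

```csharp
using System;
using System.Text.RegularExpressions;

static class LookaheadDemo
{
    // Pattern copied from the answer above.
    static readonly Regex NonWordRun = new Regex(@"(?!-)(\W| -)+");

    // Hypothetical helper, introduced for this sketch only.
    public static string Clean(string input, string replacement)
    {
        return NonWordRun.Replace(input, replacement);
    }

    static void Main()
    {
        // The asker's sample: each run of non-word characters becomes one space.
        Console.WriteLine(Clean("TEST-TEST -TEST#TEST.TEST", " ")); // TEST-TEST TEST TEST TEST

        // An empty replacement removes the runs entirely.
        Console.WriteLine(Clean("TEST-TEST -TEST#TEST.TEST", "")); // TEST-TESTTESTTESTTEST

        // Edge case: the run starts at '#', then \W swallows the '-' as well.
        Console.WriteLine(Clean("A#-B", " ")); // A B
    }
}
```

If a dash after '#' or '.' has to survive, this pattern is not enough on its own; a character class that excludes '-' is one way around that.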
Upvotes: 2
Reputation: 86
Here is what I came up with: (\s-|[^A-Za-z0-9-])
It removes all non-alphanumeric characters but keeps the "-", except when there is a space before it (" -").
Tested using sed on Linux; at the moment I don't have access to VS or Mono to test in C#:
echo "TEST-TEST -TEST#TEST.TEST -1234" | sed 's/\(\s-\|[^A-Za-z0-9-]\)/ /g'
Output
TEST-TEST TEST TEST TEST 1234
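\s- matches a dash that is preceded by a whitespace character.
[^A-Za-z0-9-] matches any character that is not a letter, a digit, or a dash.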
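Since the pattern above was only tried with sed, here is a rough C# equivalent as a sketch. The backslash-escaped \( \| \) from the sed command are written as a plain alternation here, and the names AlternationDemo and Clean are hypothetical, not from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

static class AlternationDemo
{
    // Same alternation as the sed command, as a C# verbatim string.
    // The trailing '-' inside the character class is a literal dash.
    static readonly Regex SpaceDashOrOther = new Regex(@"\s-|[^A-Za-z0-9-]");

    // Hypothetical helper, introduced for this sketch only.
    public static string Clean(string input)
    {
        return SpaceDashOrOther.Replace(input, " ");
    }

    static void Main()
    {
        // The input used in the sed test above.
        Console.WriteLine(Clean("TEST-TEST -TEST#TEST.TEST -1234")); // TEST-TEST TEST TEST TEST 1234

        // Each match is replaced on its own: " -" and the following " "
        // each become a space, leaving two spaces between A and B.
        Console.WriteLine(Clean("A - B"));
    }
}
```

Because there is no + quantifier, adjacent matches are not merged, so an input like "A - B" ends up with consecutive spaces. Wrapping the alternation in a group followed by + would collapse each run into a single space instead.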
Upvotes: 3