Reputation: 31778
I need a string with non alpha-numeric characters etc stripped out of it; I used the following:
wordsstr = Regex.Replace(wordsstr, "[^A-Za-z0-9,-_]", "");
The problem is that dots (.) are left in the string even though they are not specified to be kept. How can I make sure dots are removed too?
Many thanks.
Upvotes: 3
Views: 933
Reputation: 6203
Try
wordstr = Regex.Replace(wordstr, "[^A-Za-z0-9,\\-_]", "");
or better, if you just want to keep alphanumeric characters:
wordstr = Regex.Replace(wordstr, "[^A-Za-z0-9]", "");
The problem in your first regex is that the - char defines a range, so you have to escape it to make it behave the way you want it to.
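As a rough sketch of the escaping point: in a regular C# string literal the backslash has to be doubled ("\\-"), while a verbatim literal (@"...") takes it as written, and both give the regex engine the same pattern. The class name, the Strip helper, and the sample input below are made up for illustration and are not from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HyphenEscapeDemo
{
    // Regular string literal: the backslash itself must be doubled.
    public const string RegularLiteral = "[^A-Za-z0-9,\\-_]";

    // Verbatim string literal: the backslash is taken as-is.
    public const string VerbatimLiteral = @"[^A-Za-z0-9,\-_]";

    // Hypothetical helper: removes every character matched by the pattern.
    public static string Strip(string input, string pattern)
    {
        return Regex.Replace(input, pattern, "");
    }

    public static void Main()
    {
        // Hypothetical sample input, not taken from the question.
        string sample = "foo.bar, baz-qux_1!";

        // Both literals produce the same pattern text.
        Console.WriteLine(RegularLiteral == VerbatimLiteral);  // True

        // The period, space and '!' are removed; comma, dash and underscore stay.
        Console.WriteLine(Strip(sample, RegularLiteral));      // foobar,baz-qux_1
    }
}
```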
Upvotes: 1
Reputation: 2598
Try the code below:
wordsstr = Regex.Replace(wordsstr, "[^-A-Za-z0-9,_]", "");
Your problem would be easier to understand if you stated your expected and actual results.
Upvotes: 1
Reputation: 1502985
You are specifying that they need to be kept - you're using ,-_ which is everything from U+002C to U+005F, including U+002E (period).
If you meant the ,-_ to just mean comma, dash and underscore you'll need to escape the dash, such as:
wordsstr = Regex.Replace(input, @"[^A-Za-z0-9,\-_]", "");
Alternatively (as in Oded's comment), put the dash as the first or last character in the set, to prevent it being interpreted as a range specifier:
wordsstr = Regex.Replace(input, "[^A-Za-z0-9,_-]", "");
If that's not the aim, please be more specific: "non alpha-numeric characters etc" isn't really enough information to go on.
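To make the range issue concrete, here is a minimal sketch that lists every character covered by ,-_ (U+002C through U+005F) and then runs the original pattern and the two fixes against the same input. The class name, helper names, and the sample string are assumptions chosen for illustration only.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class RangeDemo
{
    // Builds a string of every character from first to last, inclusive.
    public static string CharsInRange(char first, char last)
    {
        return new string(Enumerable.Range(first, last - first + 1)
                                    .Select(code => (char)code)
                                    .ToArray());
    }

    // Hypothetical helper: removes every character matched by the pattern.
    public static string Strip(string input, string pattern)
    {
        return Regex.Replace(input, pattern, "");
    }

    public static void Main()
    {
        // Everything the accidental range ",-_" allows, including the period.
        Console.WriteLine(CharsInRange(',', '_'));

        // Hypothetical sample input, not taken from the question.
        string sample = "a.b,c-d_e";

        // Original pattern: ",-_" is a range, so the period survives.
        Console.WriteLine(Strip(sample, "[^A-Za-z0-9,-_]"));    // a.b,c-d_e

        // Escaped dash: only comma, dash and underscore are kept.
        Console.WriteLine(Strip(sample, @"[^A-Za-z0-9,\-_]"));  // ab,c-d_e

        // Dash at the end of the set: same result without escaping.
        Console.WriteLine(Strip(sample, "[^A-Za-z0-9,_-]"));    // ab,c-d_e
    }
}
```

The first line of output should also show characters such as / : ; < = > ? @ [ \ ] ^, all of which the original pattern keeps unintentionally.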
Upvotes: 8