James

Reputation: 31778

What C# regex can be used to strip dots (.) out of a string?

I need a string with non alpha-numeric characters etc stripped out of it; I used the following:

wordsstr = Regex.Replace(wordsstr, "[^A-Za-z0-9,-_]", "");

The problem is that dots (.) are left in the string even though they are not specified to be kept. How can I make sure dots are removed too?

Many thanks.

Upvotes: 3

Views: 933

Answers (3)

Ortwin Angermeier

Reputation: 6203

Try

wordsstr = Regex.Replace(wordsstr, "[^A-Za-z0-9,\\-_]", "");

Or, better, if you only want to keep alphanumeric characters:

wordsstr = Regex.Replace(wordsstr, "[^A-Za-z0-9]", "");

The problem in your first regex is that, inside a character class, the - character defines a range, so you have to escape it to make it behave as a literal dash.

Upvotes: 1

Junichi Ito

Reputation: 2598

Try the code below:

wordsstr = Regex.Replace(wordsstr, "[^-A-Za-z0-9,_]", "");

Your problem would be easier to understand if you included the expected and actual results.

Upvotes: 1

Jon Skeet

Reputation: 1502985

You are specifying that they need to be kept - you're using ,-_ which is everything from U+002C to U+005F, including U+002E (period).
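To see the range for yourself, here is a minimal sketch that lists every ASCII character the class [,-_] matches. The class name RangeDemo and the method name CharsMatchedByRange are made up for illustration; only Regex.IsMatch and the LINQ calls are real API.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class RangeDemo
{
    // Hypothetical helper: collects every ASCII character matched by [,-_].
    public static string CharsMatchedByRange()
    {
        var matched = Enumerable.Range(0, 128)
            .Select(i => (char)i)
            // Each candidate is a one-character string, so no anchors are needed.
            .Where(c => Regex.IsMatch(c.ToString(), "[,-_]"))
            .ToArray();
        return new string(matched);
    }

    public static void Main()
    {
        string matched = CharsMatchedByRange();
        // Prints the 52 characters from ',' (U+002C) through '_' (U+005F).
        Console.WriteLine(matched);
        // Prints True: the period (U+002E) falls inside the range.
        Console.WriteLine(matched.Contains('.'));
    }
}
```

The range also covers the digits, the uppercase letters and punctuation such as : ; < = > ? @ [ \ ] ^, so those characters would be kept as well.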

If you meant the ,-_ to just mean comma, dash and underscore you'll need to escape the dash, such as:

wordsstr = Regex.Replace(input, @"[^A-Za-z0-9,\-_]", "");

Alternatively (as in Oded's comment), put the dash as the first or last character in the set, to prevent it being interpreted as a range specifier:

wordsstr = Regex.Replace(input, "[^A-Za-z0-9,_-]", "");
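As a rough side-by-side check, the sketch below runs the original pattern and both corrected patterns over a sample string. The class name DashDemo, the three method names and the sample input are assumptions chosen for illustration, not part of the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DashDemo
{
    // Original pattern: ",-_" is read as a range from U+002C to U+005F.
    public static string StripOriginal(string s) =>
        Regex.Replace(s, "[^A-Za-z0-9,-_]", "");

    // Escaped dash: comma, dash and underscore are three literal characters.
    public static string StripEscaped(string s) =>
        Regex.Replace(s, @"[^A-Za-z0-9,\-_]", "");

    // Dash placed last in the set: also treated as a literal dash.
    public static string StripDashLast(string s) =>
        Regex.Replace(s, "[^A-Za-z0-9,_-]", "");

    public static void Main()
    {
        string sample = "Hello, world. foo-bar_baz: 42!";  // hypothetical input

        Console.WriteLine(StripOriginal(sample));  // Hello,world.foo-bar_baz:42
        Console.WriteLine(StripEscaped(sample));   // Hello,worldfoo-bar_baz42
        Console.WriteLine(StripDashLast(sample));  // Hello,worldfoo-bar_baz42
    }
}
```

With the original pattern the period and the colon survive because both sit inside the accidental range; the two corrected patterns remove them and agree with each other.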

If that's not the aim, please be more specific: "non alpha-numeric characters etc" isn't really enough information to go on.

Upvotes: 8
