Israel N

Reputation: 131

Regular expression help: remove all characters except letters/digits, and remove dots except decimal dots

I wrote this to find and remove all characters except letters, spaces, digits, percent signs and dots.

Regex.Replace("some string", @"[^a-zA-Z0-9\ \%\.]", "");

This matches every character except letters, spaces, digits, percent signs and dots. I want to change it as follows:

Match every special character (anything other than letters, spaces, digits and percent signs), and match a dot only when there are no digits around it.

How can I do this?

Upvotes: 0

Views: 880

Answers (2)

bjfletcher

Reputation: 11518

I'd handle the dot separately: let your regex leave every dot in place, then apply an additional regex to the result as follows:

(?<=\D)\.(?=\D)

which will delete the dot only when the characters on both sides of it are non-digits.

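If you also want to delete the dot in .3 (a dot preceded by a non-digit, whatever follows it):

(?<=\D)\.

If you also want to delete the dot in 3. (a dot followed by a non-digit, whatever precedes it):

\.(?=\D)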

If you want to delete every dot except a decimal one such as 1.3 (that is, the dots in a.b, 3., and .3), apply both of the above; the first pattern is then redundant.
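As a rough sketch, the two-pass approach could look like this in C#. The class name DotCleaner, the method Clean and the sample string are hypothetical, made up for illustration. The sketch assumes the dot stays in the character class of the first pattern, so that decimal dots survive the first pass.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DotCleaner
{
    // Hypothetical helper, not part of the original post.
    public static string Clean(string input)
    {
        // Pass 1: the question's character class (unneeded escapes dropped).
        // Dots are deliberately kept at this stage.
        string kept = Regex.Replace(input, @"[^a-zA-Z0-9 %.]", "");

        // Pass 2: delete a dot only when both neighbours are non-digits.
        return Regex.Replace(kept, @"(?<=\D)\.(?=\D)", "");
    }

    public static void Main()
    {
        // Prints: Price 1.5% off today only
        Console.WriteLine(Clean("Price: 1.5% off, today. only!"));
    }
}
```

One thing to watch: \D has to match an actual character, so a dot at the very start or end of the string is left alone by the lookbehind or lookahead that needs a neighbour there. If that matters, negative lookarounds such as (?<!\d)\.(?!\d) may be a better fit, since they also succeed when there is no neighbouring character.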

Explanation:

The (?<=...) and (?=...) are a lookbehind and a lookahead respectively: each checks that the given text is there, but doesn't include it in the match, so it isn't touched by the substitution.

The \D matches any character that is not a digit. \d matches a digit.

The \. matches a literal dot; it has to be escaped because an unescaped . in a regex matches any character.
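To illustrate why the lookarounds matter, here is a small comparison, again only a sketch with hypothetical names (LookaroundDemo, WithLookarounds, WithoutLookarounds). Without lookarounds, the neighbouring characters become part of the match and are deleted together with the dot.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Neighbours are only checked, so just the dot is removed.
    public static string WithLookarounds(string s) =>
        Regex.Replace(s, @"(?<=\D)\.(?=\D)", "");

    // Neighbours are consumed by the match, so they are removed as well.
    public static string WithoutLookarounds(string s) =>
        Regex.Replace(s, @"\D\.\D", "");

    public static void Main()
    {
        Console.WriteLine(WithLookarounds("a.b 1.2"));    // prints "ab 1.2"
        Console.WriteLine(WithoutLookarounds("a.b 1.2")); // prints " 1.2"
    }
}
```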

Upvotes: 2

jdweng

Reputation: 34433

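How about this:

// Requires: using System.Text.RegularExpressions;
string test = "abc. 1.2";
// Capture a letter that is followed by a dot.
string pattern = "([a-zA-Z])(\\.)";

Regex expr = new Regex(pattern);
// Keep only the captured letter, which drops the dot: "abc 1.2"
string output = expr.Replace(test, "$1");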

Upvotes: -1
