FieryA

Reputation: 297

Ignore accented letters while filtering a string in C#

In the following block of code, accented letters are not recognized (I fall into the "else" branch):

           StringBuilder sb = new StringBuilder();
           foreach (char c in str) {
              if ((c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '.' || c == '_') {
                 sb.Append(c);
              }
              else
              {
                 // if c is accented, I arrive here
              }
           }

What can I do to ignore accents? Thanks for your help.

Upvotes: 1

Views: 1250

Answers (2)

sstan

Reputation: 36483

Consider using char.IsLetterOrDigit(c).

Indicates whether the specified Unicode character is categorized as a letter or a decimal digit.

if (char.IsLetterOrDigit(c) || c == '.' || c == '_') {
    sb.Append(c);
}

The function returns true for any letter, including accented ones.
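To make the effect concrete, here is a minimal sketch of the question's loop rewritten around char.IsLetterOrDigit. The class and method names (FilterSketch, KeepLettersDigitsDotUnderscore) are made up for illustration and are not from the original post.

```csharp
using System.Text;

public static class FilterSketch
{
    // Hypothetical helper: keeps letters and digits (from any script),
    // plus '.' and '_', and drops every other character.
    public static string KeepLettersDigitsDotUnderscore(string str)
    {
        var sb = new StringBuilder();
        foreach (char c in str)
        {
            if (char.IsLetterOrDigit(c) || c == '.' || c == '_')
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }
}
```

With this sketch, KeepLettersDigitsDotUnderscore("caf\u00E9_\u00E9t\u00E9 2015!") should keep the accented letters and drop only the space and the exclamation mark. One caveat: this relies on the accented letters being precomposed (a single char such as U+00E9). If the input is in decomposed form, the combining accent is a separate non-spacing mark, not a letter, so it gets dropped while the base letter is kept.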

Upvotes: 8

spender

Reputation: 120480

How about just cleaning up the strings by removing accents and diacritics?

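using System.Linq;   // Where
using System.Text;   // NormalizationForm

public string RemoveAccentsAndDiacritics(string s)
{
    // FormD splits each accented letter into a base letter plus combining marks;
    // filtering out the NonSpacingMark characters leaves only the base letters.
    return string.Concat(
        s.Normalize(NormalizationForm.FormD)
         .Where(c => System.Globalization.CharUnicodeInfo.GetUnicodeCategory(c) !=
                     System.Globalization.UnicodeCategory.NonSpacingMark));
}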
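If the goal is to keep the question's original ASCII-only filter, one option is to strip the accents first and then run the filter on the result. The following is a rough sketch under that assumption. The class name AccentFoldingSketch and the method ToAsciiIdentifier are hypothetical names introduced here, not part of either answer.

```csharp
using System.Globalization;
using System.Linq;
using System.Text;

public static class AccentFoldingSketch
{
    // Same technique as above: decompose, then drop the combining marks.
    public static string RemoveAccentsAndDiacritics(string s)
    {
        return string.Concat(
            s.Normalize(NormalizationForm.FormD)
             .Where(c => CharUnicodeInfo.GetUnicodeCategory(c) !=
                         UnicodeCategory.NonSpacingMark));
    }

    // Hypothetical helper: strips accents, then applies the question's
    // ASCII filter (digits, A-Z, a-z, '.' and '_').
    public static string ToAsciiIdentifier(string str)
    {
        var sb = new StringBuilder();
        foreach (char c in RemoveAccentsAndDiacritics(str))
        {
            if ((c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
                (c >= 'a' && c <= 'z') || c == '.' || c == '_')
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }
}
```

For example, ToAsciiIdentifier("Cr\u00E8me br\u00FBl\u00E9e_2.0") should yield "Cremebrulee_2.0". Note that letters without a canonical decomposition, such as "ø" or "ß", are not turned into ASCII by normalization, so the ASCII filter in this sketch would still drop them.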

Upvotes: 2
