Reputation: 297
In the following block of code, accented letters are not recognized (I fall into the "else" branch):
StringBuilder sb = new StringBuilder();
foreach (char c in str) {
    if ((c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '.' || c == '_') {
        sb.Append(c);
    }
    else
    {
        // if c is accented, execution arrives here
    }
}
What can I do to ignore accents? Thanks for your help.
Upvotes: 1
Views: 1250
Reputation: 36483
Consider using char.IsLetterOrDigit(c).
Indicates whether the specified Unicode character is categorized as a letter or a decimal digit.
if (char.IsLetterOrDigit(c) || c == '.' || c == '_') {
    sb.Append(c);
}
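The function returns true for any letter, including accented ones. Note that it also returns true for letters and decimal digits from non-Latin scripts, so it accepts more than the original ASCII ranges did.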
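To make the behaviour concrete, here is a minimal sketch of the whole loop wrapped in a method. The class and method names (Filter, KeepLettersAndDigits) are made up for illustration and are not from the question.

```csharp
using System;
using System.Text;

static class Filter
{
    // Hypothetical helper: keeps letters, digits, '.' and '_' and drops everything else.
    public static string KeepLettersAndDigits(string str)
    {
        var sb = new StringBuilder();
        foreach (char c in str)
        {
            // char.IsLetterOrDigit is true for precomposed accented letters such as U+00E9.
            if (char.IsLetterOrDigit(c) || c == '.' || c == '_')
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }
}
```

With a precomposed "é" (U+00E9), Filter.KeepLettersAndDigits("caf\u00E9_1.txt!") keeps the accented letter and only drops the "!". If the input is in decomposed form (a plain "e" followed by the combining accent U+0301), the combining mark is not a letter and is dropped, leaving a plain "e".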
Upvotes: 8
Reputation: 120480
How about just cleaning up the strings by removing accents and diacritics?
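// Needs: using System.Linq; using System.Text;
public string RemoveAccentsAndDiacritics(string s)
{
    // FormD decomposes each accented letter into a base letter followed by
    // combining marks; dropping the marks leaves only the base letters.
    return string.Concat(
        s.Normalize(NormalizationForm.FormD)
         .Where(c => System.Globalization.CharUnicodeInfo.GetUnicodeCategory(c) !=
                     System.Globalization.UnicodeCategory.NonSpacingMark));
}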
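As a rough sketch of how this could be combined with the ASCII filter from the question, the string can be stripped of accents first and then filtered, so that "é" ends up as "e" instead of being dropped. The class name AccentFolding and the method ToAsciiIdentifier are hypothetical names chosen for this example.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;

static class AccentFolding
{
    // Decompose to FormD, then drop the combining (non-spacing) marks.
    public static string RemoveAccentsAndDiacritics(string s)
    {
        return string.Concat(
            s.Normalize(NormalizationForm.FormD)
             .Where(c => CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark));
    }

    // Hypothetical wrapper: strip accents, then apply the question's ASCII filter.
    public static string ToAsciiIdentifier(string str)
    {
        var sb = new StringBuilder();
        foreach (char c in RemoveAccentsAndDiacritics(str))
        {
            if ((c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
                (c >= 'a' && c <= 'z') || c == '.' || c == '_')
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }
}
```

For example, AccentFolding.ToAsciiIdentifier("Cr\u00E8me br\u00FBl\u00E9e_1.txt") gives "Cremebrulee_1.txt" (the space is removed by the filter, not by the normalization). Letters that have no canonical decomposition, such as "ø" or "ß", are not turned into ASCII by this approach and are simply dropped by the filter.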
Upvotes: 2