Adam H

Reputation: 551

Removing accents in string before inserting into SQL database

I have written the test below to remove accented characters from a string, and it works. Company policy prevents me from showing code from the actual program, so this test reproduces the offending code instead.

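using System;
using System.Text;

class Program
{
    static void Main(string[] args)
    {
        String name = "Damián";

        // Print the original name, a blank line, then the stripped name.
        Console.WriteLine(name);
        Console.WriteLine("");

        Console.WriteLine(removeAccents(name));
        Console.ReadLine();
    }

    // Encodes the text as ISO-8859-8, which has no accented Latin letters,
    // relying on the encoder's best-fit fallback to map 'á' to 'a',
    // then decodes the resulting bytes as UTF-8.
    static string removeAccents(string text)
    {
        return Encoding.UTF8.GetString(Encoding.GetEncoding("ISO-8859-8").GetBytes(text));
    }
}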

However, when I try to insert the new string into a database, the accented characters reappear in the data. I am using a parameterised SqlCommand to insert the data. The accents are absent when I inspect the string in the debugger; they only show up once the command has been executed. Could this be a matter of changing the text encoding?

Any help on this would be greatly appreciated.

EDIT:

The above code does remove the accents and produces the output:

Damián

Damian

_

However, when the name is entered in the database it contains the 'á' again.

Upvotes: 0

Views: 660

Answers (1)

Rich Bryant

Reputation: 907

I think your "removeAccents" function needs a little work.

Let's push it forward a couple of steps:

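// Requires: using System.Globalization; using System.Text;
static string RemoveAccents(string text)
{
    // FormD splits each accented letter into a base letter plus combining marks.
    var normalized = text.Normalize(NormalizationForm.FormD);
    var builder = new StringBuilder();

    foreach (var character in normalized)
    {
        // Keep everything except the combining (non-spacing) marks.
        var unicodeCategory = CharUnicodeInfo.GetUnicodeCategory(character);
        if (unicodeCategory != UnicodeCategory.NonSpacingMark)
        {
            builder.Append(character);
        }
    }

    // Recompose whatever is left into the usual precomposed form.
    return builder.ToString().Normalize(NormalizationForm.FormC);
}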

Let's see if that helps.
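
As a rough way to check the result before it reaches the SqlCommand, here is a minimal sketch that runs the same RemoveAccents logic on your sample name and dumps the code points of what comes back. The AccentDemo class and the Main harness are just names I made up for illustration; they are not part of your program.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class AccentDemo
{
    // Same approach as RemoveAccents above:
    // decompose, drop combining marks, recompose.
    public static string RemoveAccents(string text)
    {
        var builder = new StringBuilder();

        foreach (var character in text.Normalize(NormalizationForm.FormD))
        {
            if (CharUnicodeInfo.GetUnicodeCategory(character) != UnicodeCategory.NonSpacingMark)
            {
                builder.Append(character);
            }
        }

        return builder.ToString().Normalize(NormalizationForm.FormC);
    }

    public static void Main()
    {
        string cleaned = RemoveAccents("Damián");
        Console.WriteLine(cleaned); // prints "Damian"

        // Dump each code point so any leftover non-ASCII character is visible.
        foreach (char c in cleaned)
        {
            Console.Write("U+{0:X4} ", (int)c);
        }
        Console.WriteLine();
    }
}
```

If the dump shows only plain ASCII code points but the row still contains 'á', it may be worth checking that the parameter is assigned the returned value rather than the original variable.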

Upvotes: 1
