vicky

Reputation: 23

Change the encoding of a text file from ANSI to UTF-8 in C# without altering any characters

Can anyone help me out? I have tried several approaches without getting the desired result. I want to change the encoding of an existing text (.txt) file from ANSI to UTF-8. The file contains characters such as ö and ü. When I do it manually, by opening the file in a text editor and choosing File => Save As, the Encoding list shows ANSI. From there I can change the encoding from ANSI to UTF-8, and none of the contents or characters change. But when I do the same thing in code, it does not work.

==> First approach I tried, using the following code:

if (!System.IO.Directory.Exists(System.Windows.Forms.Application.StartupPath + "\\Temp"))
{
    System.IO.Directory.CreateDirectory(System.Windows.Forms.Application.StartupPath + "\\Temp");
}
string destPath = System.Windows.Forms.Application.StartupPath + "\\Temp\\temporarytextfile.txt";

File.WriteAllText(destPath, File.ReadAllText(path, Encoding.Default), Encoding.UTF8);

==> Second alternative I tried:

using (Stream fileStream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
    using (Stream destStream = new FileStream(destPath, FileMode.Create, FileAccess.Write, FileShare.ReadWrite))
    {
        using (var reader = new BinaryReader(fileStream, Encoding.Default))
        {
            using (var writer = new BinaryWriter(destStream, Encoding.UTF8))
            {
                var srcBytes = new byte[fileStream.Length];
                reader.Read(srcBytes, 0, srcBytes.Length);
                writer.Write(srcBytes);

            }
        }
    }
}

==> Third alternative I tried:

System.IO.StreamWriter file = new System.IO.StreamWriter(destPath, true, Encoding.Default);
using (StreamReader sr = new StreamReader(path, Encoding.UTF8, true))
{
    String line1;
    while ((line1 = sr.ReadLine()) != null)
    {
        file.WriteLine(line1);
    }
}

file.Close();

But unfortunately, none of the above solutions worked for me.

Upvotes: 1

Views: 12043

Answers (3)

merrais

Reputation: 391

I had the same need. Here is how I proceeded:

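    // Assumed field: the snippet below assigns to "message" without declaring it.
    string message;

    // Returns 0 on error, 1 if the file is already in the target encoding, 2 if it was re-encoded.
    int Encode(string file, Encoding encode)
    {
        int retour = 0;
        try
        {
            // Without a BOM, StreamReader decodes the file as UTF-8. For an ANSI source,
            // pass the source encoding explicitly, e.g. new StreamReader(file, Encoding.Default).
            using (var reader = new StreamReader(file))
            {
                reader.Peek(); // encoding detection only happens on the first read
                if (!reader.CurrentEncoding.Equals(encode))
                {
                    string buffer = reader.ReadToEnd();
                    reader.Close(); // release the file before overwriting it
                    using (var writer = new StreamWriter(file, false, encode))
                    {
                        writer.Write(buffer);
                    }
                    message = string.Format("Encode {0} !", file);
                    retour = 2;
                }
                else retour = 1;
            }
        }
        catch (Exception e)
        {
            message = string.Format("{0} ?", e.Message);
        }
        return retour;
    }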

    /// <summary>
    /// Change encoding to UTF8
    /// </summary>
    /// <param name="file"></param>
    /// <returns></returns>
    public int toUTF8(string file)
    {
        return Encode(file, Encoding.UTF8);
    }

    public int toANSI(string file)
    {
        return Encode(file, Encoding.Default);
    }

Upvotes: 1

Aneef

Reputation: 3729

Have you tried Encoding.Convert? See the MSDN page and its example below:

http://msdn.microsoft.com/en-us/library/system.text.encoding.convert%28v=vs.71%29.aspx

using System;
using System.Text;
namespace ConvertExample
{
   class ConvertExampleClass
   {
      static void Main()
      {
         string unicodeString = "This string contains the unicode character Pi(\u03a0)";

         // Create two different encodings.
         Encoding ascii = Encoding.ASCII;
         Encoding unicode = Encoding.Unicode;

         // Convert the string into a byte[].
         byte[] unicodeBytes = unicode.GetBytes(unicodeString);

         // Perform the conversion from one encoding to the other.
         byte[] asciiBytes = Encoding.Convert(unicode, ascii, unicodeBytes);

         // Convert the new byte[] into a char[] and then into a string.
         // This is a slightly different approach to converting to illustrate
         // the use of GetCharCount/GetChars.
         char[] asciiChars = new char[ascii.GetCharCount(asciiBytes, 0, asciiBytes.Length)];
         ascii.GetChars(asciiBytes, 0, asciiBytes.Length, asciiChars, 0);
         string asciiString = new string(asciiChars);

         // Display the strings created before and after the conversion.
         Console.WriteLine("Original string: {0}", unicodeString);
         Console.WriteLine("Ascii converted string: {0}", asciiString);
      }
   }
}
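
Applied to the question's case, the same Encoding.Convert call can work on a file's raw bytes rather than on a string literal. This is only a sketch: the class and method names (ByteLevelConverter, ConvertFileBytes) are hypothetical, and it assumes the caller already knows the source encoding of the file.

```csharp
using System.IO;
using System.Text;

static class ByteLevelConverter
{
    // Hypothetical helper: transcodes the bytes of a file from one encoding to another.
    // Assumes "source" really is the encoding the file was written in.
    public static void ConvertFileBytes(string sourcePath, string destPath,
                                        Encoding source, Encoding target)
    {
        byte[] sourceBytes = File.ReadAllBytes(sourcePath);

        // Decode with the source encoding and re-encode with the target encoding.
        byte[] targetBytes = Encoding.Convert(source, target, sourceBytes);

        // Writes exactly the converted bytes; no byte order mark is added.
        File.WriteAllBytes(destPath, targetBytes);
    }
}
```

A call might look like ByteLevelConverter.ConvertFileBytes(path, destPath, Encoding.Default, Encoding.UTF8), which is only correct when Encoding.Default matches the encoding the file was saved in.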

Upvotes: -1

Guffa

Reputation: 700242

The problem with ANSI is that it's not a specific encoding; it's just a term for "some 8-bit encoding that is the default for the system where the file was created".

If the file was created on the same system, and the default encoding hasn't changed, you can just use Encoding.Default to read it, so your first and third versions would work. (Your second version just copies the file without any changes.) Otherwise you have to know exactly which encoding was used.

This example uses the windows-1250 code page:

File.ReadAllText(path, Encoding.GetEncoding(1250))

See the documentation for the Encoding class for a list of available encodings.
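
Putting the read and the write together, a minimal sketch of the whole conversion could look like the following. The names EncodingConverter and ConvertToUtf8 and the writeBom parameter are illustrative assumptions, not part of any library, and the sketch assumes the source encoding is known in advance.

```csharp
using System.IO;
using System.Text;

static class EncodingConverter
{
    // Hypothetical helper: re-encodes a text file from a known source encoding to UTF-8.
    // writeBom controls whether the output starts with the UTF-8 byte order mark.
    public static void ConvertToUtf8(string sourcePath, string destPath,
                                     Encoding sourceEncoding, bool writeBom = true)
    {
        // Decode the bytes using the encoding the file was actually written in.
        string text = File.ReadAllText(sourcePath, sourceEncoding);

        // Encode the same characters as UTF-8 and write them out.
        File.WriteAllText(destPath, text, new UTF8Encoding(writeBom));
    }
}
```

For a file known to be in windows-1250, the call would be EncodingConverter.ConvertToUtf8(path, destPath, Encoding.GetEncoding(1250)). On .NET Core and .NET 5 or later, code page encodings such as 1250 are only available after calling Encoding.RegisterProvider(CodePagesEncodingProvider.Instance); on .NET Framework they are available by default.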

Upvotes: 7
