Siegfried.V

Reputation: 1595

StreamWriter Encoding.Default uses different encodings?

I am trying to export a file to an old program that doesn't recognise Unicode (my whole database uses utf8_unicode_ci).

When I export the file, I use Encoding.Default:

using (StreamWriter sw = new StreamWriter(parcours + "2", false, Encoding.Default))
{
    foreach (string st in output)
    {
        sw.WriteLine(st);
    }
}

What is strange is that in some cases the file is read correctly and in others it is not, even though I use exactly the same function.

When I open with Notepad++, I can see that the file working is in ANSI, and the one not working is in Macintosh.

How can I always export ANSI? I guess that using the Default value lets the encoding change by itself?

Note: Here it is said that "ANSI" in Notepad just means the file is not Unicode, so I don't know whether I can trust Notepad's information.

Edit: As suggested by CodeCaster, I used the Windows-1251 encoding, and I am back where I started, but at least I know that the encoding is where the error is.

Honestly, I don't understand: in debug mode all the text in my List is correct. But in some cases the text is encoded correctly and in others it is not. Concretely, here is what I mean by "works":

ДВУТАВР20К2 is written ДВУТАВР20К2 in file (it works).

Двутавр12б1 is written ƒ¬”“ј¬–12Ѕ1 in file (doesn't work).

As far as I know, a string itself has no encoding, so how can I explain that?

Upvotes: 1

Views: 2371

Answers (2)

Ohad Bitton

Reputation: 515

From looking at .NET Encoding code

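Calling Encoding.Default asks the OS for its configured default encoding. On .NET Framework that is the system's active ANSI code page, so the result depends on the machine's regional settings; on .NET Core and later it is always UTF-8. The page suggests that you use UTF-8 or UTF-16 when possible (most likely the first one). Try this post if you want to read more.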
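
To see what Encoding.Default actually resolves to on a given machine, a small probe like the one below can help. This is only a sketch: the class name DefaultEncodingProbe and the method Describe are made up for illustration, and the printed values will differ between machines and runtimes.

```csharp
using System;
using System.Text;

public static class DefaultEncodingProbe
{
    // Builds a one-line description of the encoding that
    // Encoding.Default resolves to on this machine and runtime.
    public static string Describe()
    {
        Encoding e = Encoding.Default;
        return string.Format("{0} (code page {1}, {2})",
            e.WebName, e.CodePage, e.EncodingName);
    }

    public static void Main()
    {
        // Run this on every machine that produces an export file
        // and compare the lines that are printed.
        Console.WriteLine(Describe());
    }
}
```

If two machines print different code pages, files written with Encoding.Default on them will contain different bytes for the same text.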

Upvotes: 0

CodeCaster

Reputation: 151730

When I open with Notepad++, I can see that the file working is in ANSI, and the one not working is in Macintosh.

If you Google that, you'll find that Notepad++'s encoding/code page auto detection isn't flawless.

If you want to write Cyrillic characters (which I assume you want, given the location in your profile) using an ANSI code page (which you want because the program you're writing the file for doesn't understand Unicode), the code page you want is Code Page 1251 Windows Cyrillic (Slavic). To get an encoding that maps characters to bytes according to that code page, use Encoding.GetEncoding():

using (StreamWriter sw = new StreamWriter(..., Encoding.GetEncoding("windows-1251")))
{
}

This assumes that the program that reads the files also uses that code page. That is the problem with non-Unicode text files: the writer and the reader of the file have to agree on the encoding. So ultimately, you should find out which specific encoding the consuming application expects. I just assumed here that it is in fact Windows-1251.
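
To illustrate the point about writer and reader agreeing, here is a minimal sketch that writes a line as Windows-1251 and reads the same bytes back twice: once with Windows-1251 and once with a different code page. The class and method names (Cp1251RoundTrip, GetCp1251, WriteLines, ReadAll) are hypothetical, "x-mac-cyrillic" is picked only as an example of a mismatching reader, and the RegisterProvider line assumes .NET Core or .NET 5+ (on .NET Framework it should be removed, since the code page encodings are available there without it).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class Cp1251RoundTrip
{
    public static Encoding GetCp1251()
    {
        // Assumption: running on .NET Core / .NET 5+, where code page
        // encodings must be registered first. Remove on .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding("windows-1251");
    }

    // Writes each line to the file using the explicitly chosen encoding.
    public static void WriteLines(string path, IEnumerable<string> lines, Encoding enc)
    {
        using (StreamWriter sw = new StreamWriter(path, false, enc))
        {
            foreach (string line in lines)
            {
                sw.WriteLine(line);
            }
        }
    }

    // Decodes the whole file with the encoding the reader assumes.
    public static string ReadAll(string path, Encoding enc)
    {
        return File.ReadAllText(path, enc);
    }

    public static void Main()
    {
        Encoding cp1251 = GetCp1251();
        string path = Path.GetTempFileName();
        string text = "ДВУТАВР20К2";

        WriteLines(path, new[] { text }, cp1251);

        // Same code page on both sides: the text comes back intact.
        string same = ReadAll(path, cp1251).TrimEnd();
        // A reader that assumes another code page sees different characters.
        string other = ReadAll(path, Encoding.GetEncoding("x-mac-cyrillic")).TrimEnd();

        Console.WriteLine(same == text);
        Console.WriteLine(other == text);

        File.Delete(path);
    }
}
```

The bytes in the file are identical in both reads; only the reader's assumption changes. That is also why an editor's guess at the encoding can make a correctly written file look broken.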

Upvotes: 3
