Reputation: 140753
The picture below explains all:
(Image: http://img133.imageshack.us/img133/4206/accentar9.png)
The variable textInput comes from File.ReadAllText(path);
and characters such as ' é è ... are not displayed. When I run my unit test, everything is fine and I see them. Why?
Upvotes: 3
Views: 2471
Reputation: 7493
Try setting your console session's output code page using the chcp command. The code pages supported by Windows are here, here, and here. Remember, fundamentally the console is pretty simple: it displays Unicode or DBCS characters by using a code page to determine the glyph that will be displayed.
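The output code page can also be changed from inside the program instead of with chcp. This is a minimal sketch, assuming the goal is UTF-8 console output. Console.OutputEncoding is the standard .NET property, the class name and sample text are made up for illustration, and whether the glyphs actually render still depends on the console font:

```csharp
using System;
using System.Text;

class ConsoleEncodingDemo
{
    static void Main()
    {
        // Show which code page the console session currently uses for output.
        Console.WriteLine("Before: " + Console.OutputEncoding.CodePage);

        // Roughly the in-process counterpart of running "chcp 65001".
        Console.OutputEncoding = Encoding.UTF8;
        Console.WriteLine("After: " + Console.OutputEncoding.CodePage);

        // Hypothetical sample text containing accented characters.
        Console.WriteLine("\u00E9 \u00E8 \u00E0");
    }
}
```

Setting the property only affects the current process, so it does not change the code page of other console sessions.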
Upvotes: 1
Reputation: 545488
The .NET classes (System.IO.StreamReader and the like) use UTF-8 as the default encoding. If you want to read a different encoding, you have to pass it explicitly to the appropriate constructor overload.
Also note that there's not one single encoding called “ANSI”. You're probably referring to the Windows codepage 1252 aka “Western European”. Notice that this is different from the Windows default encoding in other countries. This is relevant when you try to use System.Text.Encoding.Default, because this actually differs from system to system.
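To make the difference concrete, here is a small sketch that reads the same bytes once with the default encoding and once with an explicit one. It assumes a modern .NET runtime, where code page 1252 has to be registered through CodePagesEncodingProvider before use (on the classic .NET Framework that registration line should not be needed). The class name, the temporary file, and the sample word are hypothetical:

```csharp
using System;
using System.IO;
using System.Text;

class ExplicitEncodingDemo
{
    static void Main()
    {
        // Assumption: modern .NET, where code page 1252 must be registered first.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding windows1252 = Encoding.GetEncoding(1252);

        // Hypothetical sample file: "café" stored as Windows-1252 bytes (é = 0xE9).
        string path = Path.GetTempFileName();
        File.WriteAllBytes(path, new byte[] { 0x63, 0x61, 0x66, 0xE9 });

        // No encoding given: the reader decodes as UTF-8, where a lone 0xE9 is invalid.
        string asUtf8;
        using (var reader = new StreamReader(path))
        {
            asUtf8 = reader.ReadToEnd();
        }

        // Encoding passed explicitly through the constructor overload.
        string as1252;
        using (var reader = new StreamReader(path, windows1252))
        {
            as1252 = reader.ReadToEnd();
        }

        // U+FFFD is the replacement character emitted for undecodable bytes.
        Console.WriteLine("UTF-8 read lost the accent: " + asUtf8.Contains("\uFFFD"));
        Console.WriteLine("1252 read kept the accent:  " + (as1252 == "caf\u00E9"));

        File.Delete(path);
    }
}
```

The comparison is done on the decoded strings rather than on console output, so the result does not depend on the console's own code page.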
/EDIT: It seems you misunderstood both my answer and my comment:
So, finally: The solution to your problem should be the following code:
string text = System.IO.File.ReadAllText("path", Encoding.GetEncoding(1252));
The important part here is the use of an appropriate System.Text.Encoding instance.
However, this assumes that your encoding is indeed Windows-1252 (but I believe that's what Notepad++ means by “ANSI”). I have no idea why your text gets displayed correctly when read by NUnit. I suppose that NUnit either has some kind of autodiscovery for text encodings or that NUnit uses some weird defaults (i.e. not UTF-8).
Oh, and by the way: “ANSI” really refers to the “American National Standards Institute”. There are a lot of completely different standards that have “ANSI” as part of their names. For example, C++ is (among others) also an ANSI standard.
Only in some contexts is it (imprecisely) used to refer to the Windows encodings. But even there, as I've tried to explain, it usually doesn't refer to a specific encoding but rather to a class of encodings that Windows uses as defaults for different countries. One of these is Windows-1252.
Upvotes: 3
Reputation: 140753
I do not know why it works with NUnit, but when I open the file in Notepad++ I see ANSI as the format. I have now converted it to UTF-8 and it works.
I am still wondering why it worked with NUnit and not in the console, but at least it works now.
Update: I do not get why I got downvoted on the question and on this answer, because the question is still valid: why can't I read an ANSI file in a console application when I can in NUnit?
Upvotes: -1