Character Encoding in .NET

Question

I have exported an Excel 2007 document as CSV (separated by semicolons). I am using Czech Office 2010 and Czech Windows 7.

When I read the file in .NET (C#), text containing special Czech characters is corrupted. This happens when I use something like

string[] lines = File.ReadAllLines(path); (from System.IO.File)

So I guess I need to specify the right encoding explicitly, right? So I tried:

string[] lines = File.ReadAllLines(path, encoding);

where the encoding variable was defined as, for example,

Encoding encoding = Encoding.UTF8;

None of the options worked. Strangest of all, some of them, such as Encoding.Unicode, even threw an IndexOutOfRangeException.

How should I fix this encoding problem? Thank you.

BTW, my Office manages to open and read that document correctly.

Wiktor Zychla · Accepted Answer

Most probably Excel writes your file in your system's default encoding, which should be windows-1250. Open the file with either Encoding.Default or Encoding.GetEncoding("windows-1250"). It works for us here in Poland. I don't remember any issues regarding CSVs exported from Office.
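A minimal sketch of that suggestion, assuming the file really was saved in windows-1250. The class name CsvReader and the method ReadLines are made up for illustration. The RegisterProvider call is an assumption about the runtime: on .NET Core and .NET 5+ the legacy Windows code pages are not available until the provider is registered, whereas on the classic .NET Framework the call can simply be dropped. Also note that Encoding.Default only means the system ANSI code page on .NET Framework; on .NET Core and later it is UTF-8, so naming the code page explicitly is the safer option.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper, not part of any library.
public static class CsvReader
{
    // Reads all lines of a file that was saved in the windows-1250 code page
    // (the usual ANSI code page on Czech and Polish Windows installations).
    public static string[] ReadLines(string path)
    {
        // Assumption: needed on .NET Core / .NET 5+ to make legacy code pages
        // available; remove this line when targeting older .NET Framework versions.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding encoding = Encoding.GetEncoding("windows-1250");
        return File.ReadAllLines(path, encoding);
    }
}
```

Each returned line can then be split on the semicolon, for example lines[0].Split(';'), to get the individual cells.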
