Reputation: 562
I have the following XML file:
<?xml version="1.0" encoding="UTF-8"?>
<students xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<student name="Adnand"/>
<student name="özil"/>
<student name="ärnold"/>
</students>
As you can see, the file declares UTF-8 encoding, but it contains some non-ASCII characters (ö, ä).
I use the following code to deserialize this XML:
public void readXML(string path)
{
    XmlSerializer deserializer = new XmlSerializer(typeof(Students));
    TextReader reader = new StreamReader(path);
    object obj = deserializer.Deserialize(reader);
    Students myStudents = (Students)obj;
}
The deserialization itself works, but the special characters are shown as the � symbol. I tried changing the encoding type, but nothing changed. Can someone tell me what alternatives I have?
ANSWER You should specify Encoding.Default, like this:
public void readXML(string path)
{
    XmlSerializer deserializer = new XmlSerializer(typeof(Students));
    // Encoding.Default is the system's ANSI code page on .NET Framework.
    TextReader reader = new StreamReader(path, Encoding.Default);
    object obj = deserializer.Deserialize(reader);
    Students myStudents = (Students)obj;
}
Upvotes: 2
Views: 2567
Reputation: 1056
It seems your file is not encoded as UTF-8 but in Windows' default ANSI encoding.
Defining the StreamReader as
TextReader reader = new StreamReader(path, Encoding.Default);
should do the trick.
Note that this is more of a workaround: using Encoding.Default is actually a very bad idea, since it depends on the machine's ANSI code page and will break on a system with different regional settings. This article gives a nice overview of why you should not use Encoding.Default (thanks to Alexander for sharing). It's better to use UTF-8, as most systems can deal with it.
In your specific case, to actually save the file as UTF-8, you have to either:
Adapt the program that creates the file to output it as UTF-8, or
If you used a text editor to create the file, save it as UTF-8 in an editor that supports that encoding (e.g. Notepad++).
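If neither option is convenient, a one-off conversion is another way to get real UTF-8 on disk. This is only a minimal sketch, assuming the file really is ISO-8859-1 or Windows-1252 (ö and ä have the same byte values in both). The names XmlReencoder and ConvertLatin1ToUtf8 are made up for illustration:

```csharp
using System.IO;
using System.Text;

static class XmlReencoder
{
    // Hypothetical helper: reads a file as ISO-8859-1 and rewrites it as UTF-8.
    // Assumption: the source file is NOT already UTF-8; converting twice
    // would corrupt the text.
    public static void ConvertLatin1ToUtf8(string sourcePath, string targetPath)
    {
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        string text = File.ReadAllText(sourcePath, latin1);

        // UTF8Encoding(false) writes UTF-8 without a byte order mark.
        File.WriteAllText(targetPath, text, new UTF8Encoding(false));
    }
}
```

After the conversion, the encoding="UTF-8" declaration in the XML matches the bytes on disk, so the original StreamReader(path) call should read the names correctly.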
Upvotes: 3
Reputation: 5203
This works for me:
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

class Program
{
    static void Main(string[] args)
    {
        List<Student> students = new List<Student>();
        XDocument xDocument = XDocument.Load("icsemmelle.xml");
        List<XElement> xStudents = xDocument.Descendants("student").ToList();
        // Copy the name attribute of every <student> element into a Student.
        foreach (XElement xStudent in xStudents)
        {
            students.Add(new Student { Name = xStudent.Attribute("name").Value });
        }
    }
}

class Student
{
    public string Name { get; set; }
}
Upvotes: 0
Reputation: 9
You can use StreamReader to specify the encoding:
Students xmlObject = null;
using (var streamReader = new StreamReader(inXML, Encoding.UTF8, true)) {
    var xmlSerializer = new XmlSerializer(typeof(Students));
    xmlObject = (Students)xmlSerializer.Deserialize(streamReader);
}
Also, have you tried the "ISO-8859-1" encoding? I mostly use it for foreign characters.
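If you do not know in advance which of the two encodings a file uses, one possible approach is to decode strictly as UTF-8 first and fall back to ISO-8859-1 only when that fails. This is a rough sketch, not a general encoding detector; XmlTextLoader and ReadUtf8OrLatin1 are hypothetical names, and the fallback encoding is an assumption about where the files come from:

```csharp
using System.IO;
using System.Text;

static class XmlTextLoader
{
    // Hypothetical helper: returns the file content decoded as UTF-8 when the
    // bytes are valid UTF-8, otherwise decoded as ISO-8859-1.
    public static string ReadUtf8OrLatin1(string path)
    {
        byte[] bytes = File.ReadAllBytes(path);
        string text;
        try
        {
            // throwOnInvalidBytes: true raises DecoderFallbackException
            // instead of silently substituting U+FFFD (the � symbol).
            text = new UTF8Encoding(false, true).GetString(bytes);
        }
        catch (DecoderFallbackException)
        {
            text = Encoding.GetEncoding("ISO-8859-1").GetString(bytes);
        }

        // GetString keeps a leading byte order mark as U+FEFF; drop it.
        return text.TrimStart('\uFEFF');
    }
}
```

The result can then be passed to the serializer through a StringReader, for example xmlSerializer.Deserialize(new StringReader(XmlTextLoader.ReadUtf8OrLatin1(inXML))), assuming inXML is a file path here.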
Upvotes: 0