AJIOB

Reputation: 339

C# XML parsing. Need to get text

I have the following code:

using System;
using System.IO;
using System.Xml.Serialization;

namespace ConsoleApp1
{
    [XmlRoot(ElementName = "doc")]
    public class Doc
    {
        [XmlElement(ElementName = "headline")]
        public string Headline { get; set; }
    }

    static class Program
    {
        static void Main(string[] args)
        {
            Doc res;

            var serializer = new XmlSerializer(typeof(Doc));
            using (var reader = new StringReader(File.ReadAllText("test.xml")))
            {
                res = (Doc) serializer.Deserialize(reader);
            }

            Console.Out.WriteLine(res.Headline.ToString());
        }
    }
}

My test.xml file contains the following:

<doc>
    <headline>AZERTY on the English <hlword>QWERTY</hlword> layout.
    </headline>
</doc>

When I try to parse it, I get an exception:

System.InvalidOperationException occurred
  HResult=0x80131509
  Message=There is an error in XML document (2, 35).
  Source=System.Xml
  StackTrace:
   at System.Xml.Serialization.XmlSerializer.Deserialize(XmlReader xmlReader, String encodingStyle, XmlDeserializationEvents events)
   at System.Xml.Serialization.XmlSerializer.Deserialize(TextReader textReader)
   at ConsoleApp1.Program.Main(String[] args) in D:\Documents\Visual Studio 2017\Projects\ConsoleApp1\ConsoleApp1\Program.cs:line 24

Inner Exception 1:
XmlException: Unexpected node type Element. ReadElementString method can only be called on elements with simple or empty content. Line 2, position 35.

From files like this I need to get either "AZERTY on the English <hlword>QWERTY</hlword> layout." or "AZERTY on the English QWERTY layout." as output. What type should the Headline property of Doc have so that I can get this text (possibly by calling its ToString() method)?

P.S. I'm using Windows 10 (Creators Update) with Visual Studio 2017 (15.3.3).

Upvotes: 0

Views: 789

Answers (2)

Jante

Reputation: 11

The reason you get the error is the hlword tag inside the content of the headline element. If you wrap the content in a CDATA section (<![CDATA[ ... ]]>), it isn't parsed as markup but read as is.

<doc>
    <headline><![CDATA[AZERTY on the English <hlword>QWERTY</hlword> layout.]]></headline>
</doc>
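
As a minimal sketch of how this plays out, the snippet below deserializes the CDATA-wrapped XML with the unchanged Doc class from the question. The XML is inlined as a string instead of being read from test.xml, and the CdataDemo class and ReadHeadline helper are hypothetical names used only for illustration. Note that this approach assumes you are able to change the XML files themselves.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

[XmlRoot(ElementName = "doc")]
public class Doc
{
    // Plain string works here because the CDATA section is read as text.
    [XmlElement(ElementName = "headline")]
    public string Headline { get; set; }
}

public static class CdataDemo
{
    // Inline copy of the CDATA-wrapped XML, so no test.xml file is needed.
    public const string Xml =
        "<doc>\n" +
        "    <headline><![CDATA[AZERTY on the English <hlword>QWERTY</hlword> layout.]]></headline>\n" +
        "</doc>";

    // Hypothetical helper: deserialize the XML and return the headline string.
    public static string ReadHeadline(string xml)
    {
        var serializer = new XmlSerializer(typeof(Doc));
        using (var reader = new StringReader(xml))
        {
            return ((Doc)serializer.Deserialize(reader)).Headline;
        }
    }

    public static void Main()
    {
        // The headline keeps the <hlword> markup as literal text.
        Console.WriteLine(ReadHeadline(CdataDemo.Xml));
    }
}
```

The tags inside the CDATA section come back as part of the string, so this gives the first of the two output forms asked for in the question.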

Upvotes: 0

StijnvanGaal

Reputation: 441

The error is telling you that it cannot parse <headline>AZERTY on the English <hlword>QWERTY</hlword> layout.</headline> into a simple string, because it has an element in it. This is called mixed content. To parse it, you need to change your classes to something like this:

[XmlRoot(ElementName = "doc")]
public class Doc
{
    [XmlElement(ElementName = "headline")]
    public Headline Headline { get; set; }
}

public class Headline
{
    [XmlText]
    public string Content { get; set; }

    [XmlElement(ElementName = "hlword")]
    public string HlWord { get; set; }
}
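
If the goal is to get the whole headline back as a single string, one possible variation (a sketch, not part of the answer above) is to collect the text and child elements into an XmlNode array, in document order, by combining [XmlText] with [XmlAnyElement]. The names MixedDoc, MixedHeadline, ToPlainText, ToMarkup and MixedDemo are hypothetical and chosen only for this illustration; the XML is inlined rather than read from test.xml.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Serialization;

[XmlRoot(ElementName = "doc")]
public class MixedDoc
{
    [XmlElement(ElementName = "headline")]
    public MixedHeadline Headline { get; set; }
}

public class MixedHeadline
{
    // Text nodes and child elements of <headline>, kept in document order.
    [XmlText]
    [XmlAnyElement]
    public XmlNode[] Nodes { get; set; }

    // Text only, with the <hlword> tags dropped.
    public string ToPlainText()
    {
        return string.Concat((Nodes ?? new XmlNode[0]).Select(n => n.InnerText)).Trim();
    }

    // Text with the child element markup preserved.
    public string ToMarkup()
    {
        return string.Concat((Nodes ?? new XmlNode[0]).Select(n => n.OuterXml)).Trim();
    }
}

public static class MixedDemo
{
    // Inline copy of the XML from the question.
    public const string Xml = @"<doc>
    <headline>AZERTY on the English <hlword>QWERTY</hlword> layout.
    </headline>
</doc>";

    // Hypothetical helper: deserialize the XML into the mixed-content classes.
    public static MixedDoc Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(MixedDoc));
        using (var reader = new StringReader(xml))
        {
            return (MixedDoc)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        MixedDoc doc = Load(Xml);
        Console.WriteLine(doc.Headline.ToPlainText());
        Console.WriteLine(doc.Headline.ToMarkup());
    }
}
```

Trim() is applied because the headline in test.xml ends with a line break and indentation before the closing tag. Keeping the nodes in one array, rather than in separate Content and HlWord properties, preserves where the hlword element sits within the surrounding text.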

Upvotes: 2
