Reputation: 3241
I'm having trouble deserializing an XML file in .NET. This is the error I'm getting:
The opening tag 'A' on line 72 position 56 does not match the end tag of 'a'. Line 72, position 118.
As you can see, it is the same tag, but one is uppercase and the other is lowercase. My XML has this structure:
<?xml version="1.0"?>
<translationfile xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<translationtext>
<es_text>Spanish text</es_text>
<en_text>English text</en_text>
<developer_comment>Plain text</developer_comment>
</translationtext>
....
</translationfile>
And this is my VB class:
Option Strict Off
Option Explicit On
Imports System.Xml.Serialization
'
'This source code was automatically generated by xsd, Version=2.0.50727.3038.
'
'''<remarks/>
<System.CodeDom.Compiler.GeneratedCodeAttribute("xsd", "2.0.50727.3038"), _
System.SerializableAttribute(), _
System.Diagnostics.DebuggerStepThroughAttribute(), _
System.ComponentModel.DesignerCategoryAttribute("code"), _
System.Xml.Serialization.XmlTypeAttribute(AnonymousType:=True), _
System.Xml.Serialization.XmlRootAttribute([Namespace]:="", IsNullable:=False)> _
Partial Public Class translationfile
Private itemsField As List(Of translationfileTranslationtext)
'''<remarks/>
<System.Xml.Serialization.XmlElementAttribute("translationtext", _
Form:=System.Xml.Schema.XmlSchemaForm.Unqualified)> _
Public Property Items As List(Of translationfileTranslationtext)
Get
Return Me.itemsField
End Get
Set(value As List(Of translationfileTranslationtext))
Me.itemsField = value
End Set
End Property
End Class
'''<remarks/>
<System.CodeDom.Compiler.GeneratedCodeAttribute("xsd", "2.0.50727.3038"), _
System.SerializableAttribute(), _
System.Diagnostics.DebuggerStepThroughAttribute(), _
System.ComponentModel.DesignerCategoryAttribute("code"), _
System.Xml.Serialization.XmlTypeAttribute(AnonymousType:=True)> _
Partial Public Class translationfileTranslationtext
Private es_textField As String
Private en_textField As String
Private developer_commentField As String
'''<remarks/>
<System.Xml.Serialization.XmlElementAttribute _
(Form:=System.Xml.Schema.XmlSchemaForm.Unqualified)> _
Public Property es_text() As String
Get
Return Me.es_textField
End Get
Set(value As String)
Me.es_textField = value
End Set
End Property
'''<remarks/>
<System.Xml.Serialization.XmlElementAttribute( _
Form:=System.Xml.Schema.XmlSchemaForm.Unqualified)> _
Public Property en_text() As String
Get
Return Me.en_textField
End Get
Set(value As String)
Me.en_textField = value
End Set
End Property
'''<remarks/>
<System.Xml.Serialization.XmlElementAttribute( _
Form:=System.Xml.Schema.XmlSchemaForm.Unqualified)> _
Public Property developer_comment() As String
Get
Return Me.developer_commentField
End Get
Set(value As String)
Me.developer_commentField = value
End Set
End Property
End Class
The problem is that both texts can contain HTML code. The XML is written by hand by the clients, and I cannot change the text inside these tags. They can also define their own tags, like <client27tagname>...</client27tagname>. For example, this is a real case:
<translationtext>
<es_text><p>Nombre</P></es_text>
<en_text><p>Name</P></en_text>
<developer_comment>irrelevant text</developer_comment>
</translationtext>
When I try to deserialize an XML file, I get the error above because <p> is lowercase and </P> is uppercase. How can I deserialize it correctly without changing the text? Is there any way to treat all the text inside these tags as a plain string?
This is the code I'm using to deserialize:
Dim stream As New IO.StreamReader(path)
Dim ser As New Xml.Serialization.XmlSerializer(GetType(translationfile))
Dim myperfil As New translationfile
myperfil = CType(ser.Deserialize(stream), translationfile) 'This line throws the exception
stream.Close()
UPDATE
After making the change suggested by Olivier, this is my class:
Option Strict Off
Option Explicit On
Imports System.Xml.Serialization
Imports System.Web
<System.CodeDom.Compiler.GeneratedCodeAttribute("xsd", "2.0.50727.3038"), _
System.SerializableAttribute(), _
System.Diagnostics.DebuggerStepThroughAttribute(), _
System.ComponentModel.DesignerCategoryAttribute("code"), _
System.Xml.Serialization.XmlTypeAttribute(AnonymousType:=True), _
System.Xml.Serialization.XmlRootAttribute([Namespace]:="", IsNullable:=False)> _
Partial Public Class translationfile
Private itemsField As List(Of translationfileTranslationtext)
<System.Xml.Serialization.XmlElementAttribute("translationtext", Form:=System.Xml.Schema.XmlSchemaForm.Unqualified)> _
Public Property Items As List(Of translationfileTranslationtext)
Get
Return Me.itemsField
End Get
Set(value As List(Of translationfileTranslationtext))
Me.itemsField = value
End Set
End Property
End Class
<System.CodeDom.Compiler.GeneratedCodeAttribute("xsd", "2.0.50727.3038"), _
System.SerializableAttribute(), _
System.Diagnostics.DebuggerStepThroughAttribute(), _
System.ComponentModel.DesignerCategoryAttribute("code"), _
System.Xml.Serialization.XmlTypeAttribute(AnonymousType:=True)> _
Partial Public Class translationfileTranslationtext
Private es_textField As String
Private en_textField As String
Private developer_commentField As String
<XmlIgnore()>
Public Property es_text() As String
Get
Return Me.es_textField
End Get
Set(value As String)
Me.es_textField = value
End Set
End Property
<XmlElement(ElementName:="es_text", Form:=System.Xml.Schema.XmlSchemaForm.Unqualified)> _
Public Property es_HtmlText() As String
Get
Return System.Web.HttpUtility.HtmlEncode(Me.es_textField)
End Get
Set(value As String)
Me.es_textField = HttpUtility.HtmlDecode(value)
End Set
End Property
<XmlIgnore()>
Public Property en_text() As String
Get
Return Me.en_textField
End Get
Set(value As String)
Me.en_textField = value
End Set
End Property
<XmlElement(ElementName:="en_text", Form:=System.Xml.Schema.XmlSchemaForm.Unqualified)> _
Public Property en_HtmlText() As String
Get
Return System.Web.HttpUtility.HtmlEncode(Me.en_textField)
End Get
Set(value As String)
Me.en_textField = HttpUtility.HtmlDecode(value)
End Set
End Property
<System.Xml.Serialization.XmlElementAttribute(Form:=System.Xml.Schema.XmlSchemaForm.Unqualified)> _
Public Property developer_comment() As String
Get
Return Me.developer_commentField
End Get
Set(value As String)
Me.developer_commentField = value
End Set
End Property
End Class
Upvotes: 0
Views: 2251
Reputation: 112537
Use HttpUtility.HtmlEncode to encode your text and HttpUtility.HtmlDecode to decode it.
You could create an additional property for this and exclude the original property from serialization.
'Exclude the original property from serialization
<XmlIgnore()> _
Public Property en_text() As String
Get
Return Me.en_textField
End Get
Set(value As String)
Me.en_textField = value
End Set
End Property
'Name the encoding/decoding property element like the original property
<XmlElement(ElementName := "en_text", Form:=XmlSchemaForm.Unqualified)> _
Public Property en_HtmlEncodedText() As String
Get
Return HttpUtility.HtmlEncode(Me.en_textField)
End Get
Set(value As String)
Me.en_textField = HttpUtility.HtmlDecode(value)
End Set
End Property
HTML encoding will translate "<" and ">" into "&lt;" and "&gt;" and thus make the inner tags invisible to XML.
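As a minimal sketch of that round trip on its own, outside the serializer: the snippet below assumes a reference to System.Web is available, and the module name EncodeDemo is made up for illustration.

```vb
Imports System
Imports System.Web

Module EncodeDemo
    Sub Main()
        ' Text as the client wrote it, with mismatched tag case.
        Dim raw As String = "<p>Name</P>"

        ' Encoding replaces the angle brackets with entities,
        ' so an XML parser no longer sees them as markup.
        Dim encoded As String = HttpUtility.HtmlEncode(raw)
        Console.WriteLine(encoded)   ' &lt;p&gt;Name&lt;/P&gt;

        ' Decoding gives back the original text unchanged.
        Console.WriteLine(HttpUtility.HtmlDecode(encoded) = raw)   ' True
    End Sub
End Module
```

The encoded form contains no "<" at all, so the case mismatch between <p> and </P> is never checked by the XML reader.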
UPDATE
My solution works; I have now tested it. You probably tested it with an XML file that still contains the HTML tags in plain text ("<p>Name</P>"). What my code does is write the HTML as "&lt;p&gt;Name&lt;/P&gt;". This is what HttpUtility.HtmlEncode does. Therefore you must start by writing an XML file using my method. Only then will reading succeed.
Here is my write test:
Public Sub WriteTest()
Dim myperfil As New translationfile With {
.Items = New List(Of translationfileTranslationtext) From {
New translationfileTranslationtext With {.en_text = "en test", .es_text = "spanish"},
New translationfileTranslationtext With {.en_text = "<p>Name</P>", .es_text = "<p>Nombre</P>"}
}
}
Dim writer As New IO.StreamWriter(path)
Dim ser As New XmlSerializer(GetType(translationfile))
ser.Serialize(writer, myperfil)
writer.Close()
End Sub
It creates the following XML:
<?xml version="1.0" encoding="utf-8"?>
<translationfile xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<translationtext>
<es_text>spanish</es_text>
<en_text>en test</en_text>
</translationtext>
<translationtext>
<es_text>&lt;p&gt;Nombre&lt;/P&gt;</es_text>
<en_text>&lt;p&gt;Name&lt;/P&gt;</en_text>
</translationtext>
</translationfile>
And here is my read test, which throws no exception:
Public Sub ReadTest()
Dim myperfil As translationfile
Dim reader As New IO.StreamReader(path)
Dim ser As New XmlSerializer(GetType(translationfile))
myperfil = CType(ser.Deserialize(reader), translationfile)
reader.Close()
For Each item As translationfileTranslationtext In myperfil.Items
Console.WriteLine("EN = {0}, ES = {1}", item.en_text, item.es_text)
Next
Console.ReadKey()
End Sub
It writes this to the console:
EN = en test, ES = spanish
EN = <p>Name</P>, ES = <p>Nombre</P>
Upvotes: 1
Reputation: 3241
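After some tests I found a workaround:
Replace all < characters with a placeholder string: #open_key#
Replace the known tags back, for example #open_key#es_text> to <es_text>
After deserializing, replace #open_key# with < in the text.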
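A rough sketch of that workaround follows. The list of structural tokens, the names Protect and Restore, and the use of XDocument in place of XmlSerializer are all assumptions made to keep the example short; only the #open_key# placeholder comes from the post.

```vb
Imports System
Imports System.Xml.Linq

Module PlaceholderWorkaround
    ' Placeholder string taken from the post.
    Private Const OpenKey As String = "#open_key#"

    ' Assumed list of tag openings that belong to the file structure itself.
    ' Any "<" not followed by one of these is treated as plain text.
    Private ReadOnly StructuralTokens As String() = {
        "?xml ", "translationfile ", "translationfile>", "/translationfile>",
        "translationtext>", "/translationtext>",
        "es_text>", "/es_text>",
        "en_text>", "/en_text>",
        "developer_comment>", "/developer_comment>"
    }

    ' Hide every "<", then restore only the structural tags.
    Function Protect(xml As String) As String
        Dim result As String = xml.Replace("<", OpenKey)
        For Each token As String In StructuralTokens
            result = result.Replace(OpenKey & token, "<" & token)
        Next
        Return result
    End Function

    ' Put the "<" characters back into a value read from the XML.
    Function Restore(text As String) As String
        If text Is Nothing Then Return Nothing
        Return text.Replace(OpenKey, "<")
    End Function

    Sub Main()
        Dim raw As String = "<translationtext><es_text><p>Nombre</P></es_text></translationtext>"

        ' After Protect, the client's <p> and </P> are no longer markup.
        Dim doc As XDocument = XDocument.Parse(Protect(raw))
        Dim value As String = doc.Root.Element("es_text").Value

        Console.WriteLine(Restore(value))   ' <p>Nombre</P>
    End Sub
End Module
```

With the generated classes, Restore would be applied to es_text, en_text and developer_comment after Deserialize returns. This only hides "<": a bare "&" in the client text is still invalid XML, and client text that literally contains one of the structural tags would be restored as markup.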
Upvotes: 0