alwaysVBNET
alwaysVBNET

Reputation: 3310

Parsing random XML strings

In our SQL Server database we have in a text field records which are strings but are written as xml with random nodes i.e.

<Record id="5">  
    <account>
        <OldValue>125025</OldValue>
        <NewValue></NewValue>
    </account>  
    <Amount>
        <OldValue>001995</OldValue>
        <NewValue></NewValue>
    </Amount>  
</Record id>

or

<Stock>
    <Car>
        <OldValue>035</OldValue>
        <NewValue>038 </NewValue>
    </Car>
</Stock>

I need to make these records readable in a string for a report such as:

ID: 5, 
Account Old Value: 125025, 
Account New Value: -
Amount Old Value: 001995, 
Amount New Value: -

I don't think I can serialize the string because I don't have a predifined class to serialize it with. I'm in the process of creating a function to read my random xml nodes but I'm not sure if I'm following the right approach. What is more, at the moment I'm getting an error: XMLException: 'id' is an unexpected token.

Public Shared Function XmlCustom(sValue As String)
    Dim output As StringBuilder = New StringBuilder()

    If Not String.IsNullOrEmpty(sValue) Then
        Dim xmlString As String = sValue
        Try
            ' Create an XmlReader
            Using reader As XmlReader = XmlReader.Create(New StringReader(xmlString))
                Dim ws As XmlWriterSettings = New XmlWriterSettings()
                ws.Indent = True
                Using writer As XmlWriter = XmlWriter.Create(output, ws)

                    ' Parse the file and display each of the nodes.
                    While reader.Read()
                        Select Case reader.NodeType
                            Case XmlNodeType.Element
                                writer.WriteStartElement(reader.Name)
                            Case XmlNodeType.Text
                                writer.WriteString(reader.Value)
                            Case XmlNodeType.XmlDeclaration
                            Case XmlNodeType.ProcessingInstruction
                                writer.WriteProcessingInstruction(reader.Name, reader.Value)
                            Case XmlNodeType.Comment
                                 writer.WriteComment(reader.Value)
                            Case XmlNodeType.Attribute
                                 writer.WriteElementString("id", reader.Value)

                            Case XmlNodeType.EndElement
                                 writer.WriteFullEndElement()
                        End Select
                    End While
                End Using
            End Using

        Catch ex As Exception
            MsgBox(output.ToString())
            MessageBox.Show("XmlCustom - " & ex.ToString)
        End Try
    End If

End Function

Any ideas how to tackle this problem?

Upvotes: 0

Views: 766

Answers (3)

Fabio
Fabio

Reputation: 32455

Without schema you can not use serializer.
Since most of report tools support DataTable as type for DataSource, you can collect data from Xml string to the DataTable

'Method which read elements and generate columns for DataTable and save values too
'Method recursively read every element and add elements names to Column name
Public Function ReadElement(el As XElement, columnName As StringBuilder) As IList(Of Tuple(Of DataColumn, String))
    Dim temp As New List(Of Tuple(Of DataColumn, String))()
    If el Is Nothing Then Return temp
    columnName.Append(el.Name)
    If el.HasElements = True Then
        For Each child As XElement In el.Elements
            temp.AddRange(ReadElement(child, columnName))
        Next
    Else
        Dim column As New DataColumn(columnName.ToString(), GetType(String))
        Dim value As String = el.Value
        Dim item As New Tuple(Of DataColumn, String)(column, value)
        temp.Add(item)
    End If
    Return temp
End Function

Creating DataTable

Dim mainElement as XElement = XElement.Parse(yourXmlStringValue)

Dim data As New DataTable()
data.Columns.Add("Id", GetType(String))

'Generate schema for DataTable
Dim scheemaAndValues As New List(Of Tuple(Of DataColumn, String))()
For Each child As XElement In mainElement.Elements
    Dim columnName As New StringBuilder()
    scheemaAndValues.AddRange(ReadElement(child, columnName))
Next

'Add columns to DataTable
For Each arvo As Tuple(Of DataColumn, String) In scheemaAndValues
   data.Columns.Add(arvo.Item1)
Next

'Add values to the row
Dim dr As DataRow = data.NewRow()
dr.SetField(Of String)("Id", mainElement.Attribute("id").Value)
For Each arvo As Tuple(Of DataColumn, String) In scheemaAndValues
   dr.SetField(arvo.Item1.ColumnName, arvo.Item2)
Next
data.Rows.Add(dr)

Upvotes: 2

jdweng
jdweng

Reputation: 34421

See code below. I used a combination of XMLReader and XML Linq. With code below you can get each element name without worrying the order of the objects. Another choice is to use recursion like the following C# project : Recursion, parsing xml file with attributes into treeview c#. It is pretty simply to convert the C# code to VB.Net.

Imports System.Xml
Imports System.Xml.Linq
Imports System.IO
Module Module1

    Sub Main()
        Dim xml As String = _
            "<?xml version=""1.0"" encoding=""utf-8""?>" & _
            "<Root>" & _
                "<Child>" & _
                    "<SubNode1></SubNode1>" & _
                    "<SubNode2></SubNode2>" & _
                    "<SubNode3></SubNode3>" & _
                "</Child>" & _
                "<Child>" & _
                    "<SubNode1></SubNode1>" & _
                    "<SubNode2></SubNode2>" & _
                    "<SubNode3></SubNode3>" & _
                "</Child>" & _
                "<Child>" & _
                    "<SubNode1></SubNode1>" & _
                    "<SubNode2></SubNode2>" & _
                    "<SubNode3></SubNode3>" & _
                "</Child>" & _
            "</Root>"

        Dim sReader As New StringReader(xml)
        Dim reader As New XmlTextReader(sReader)

        While Not reader.EOF
            If reader.Name <> "Child" Then
                reader.ReadToFollowing("Child")
            End If
            If Not reader.EOF Then
                Dim child As XElement = XElement.ReadFrom(reader)
                For Each element As XElement In child.Elements()
                    Dim name As String = element.Name.LocalName
                    Console.WriteLine(name)
                Next
            End If

        End While
        Console.ReadLine()
    End Sub

End Module

Upvotes: 0

jcmarin2298
jcmarin2298

Reputation: 41

The reason it can't is because you have to have a schema to feed the parser that describes the XML data structure. Your SQL record should have a schema definition somewhere so that the parser can pre-load it before it reads your data.

Below you will find a lot of information about the rules XML schema descriptors.

https://msdn.microsoft.com/en-us/library/ms256129(v=vs.110).aspx

Upvotes: -1

Related Questions