user2722517
user2722517

Reputation: 39

XML using LINQ with VB.NET

Below is a copy of my xml. Basically what I have is different crops(rice, soybeans, corn, etc) within each crop I have a field node and within that node there are several nodes describing the field. I also have an expense node that contains several expenses within that field node. Expenses can vary within each field which is why i have a node that can contain all of them. What I want to do is find an easy way to query this stuff. Lets say i want to get all the info for one particular field. Well in my app if they select a certain crop then i need to list all the fields in a combobox. Is my structure correct or is there another way to do this. What if i want to add all the expenses up for all the fields in a particular crop and lets say there are 20-30 fields. How can i do this? I know how to select certain things where value = 'something' but the main thing i need to do select all fields for a particular crop and add the expeneses. Or if i select a crop and field in my app then i need to be able to pull all the data from the field node. Any help would be great!

<Data>
 <Crop>
  <Name>Corn</Name>
    <Field>
      <Name>Field1</Name>
      <Acres></Acres>
      <Bushels></Bushels>
        <Expenses>
           <expense1></expense1>
           <expense2></expense2>
           etc....
        </Expenses>
     </Field>
     <Field>
      <Name>Field2</Name>
      <Acres></Acres>
      <Bushels></Bushels>
        <Expenses>
           <expense1></expense1>
           <expense2></expense2>
           etc....
        </Expenses>
     </Field>
  </Crop>
  <Crop>
  <Name>Soybeans</Name>
    <Field>
      <Name>Field2</Name>
      <Acres></Acres>
      <Bushels></Bushels>
        <Expenses>
           <expense1></expense1>
           <expense2></expense2>
           etc....
        </Expenses>
     </Field>
     <Field>
      <Name>Field1</Name>
      <Acres></Acres>
      <Bushels></Bushels>
        <Expenses>
           <expense1></expense1>
           <expense2></expense2>
           etc....
        </Expenses>
     </Field>
  </Crop>
  <Crop>
   <Name>Rice</Name>
    <Field>
      <Name>Field3</Name>
      <Acres></Acres>
      <Bushels></Bushels>
        <Expenses>
           <expense1></expense1>
           <expense2></expense2>
           etc....
        </Expenses>
     </Field>
     <Field>
      <Name>Field4</Name>
      <Acres></Acres>
      <Bushels></Bushels>
        <Expenses>
           <expense1></expense1>
           <expense2></expense2>
           etc....
        </Expenses>
     </Field>
     <Field>
      <Name>Field5</Name>
      <Acres></Acres>
      <Bushels></Bushels>
        <Expenses>
           <expense1></expense1>
           <expense2></expense2>
           etc....
        </Expenses>
     </Field>
  </Crop>

Upvotes: 0

Views: 1190

Answers (3)

xDaevax
xDaevax

Reputation: 2022

Your XML structure makes sense (though I made some modifications that I list below). Like a database, your XML data should be structured in a way that makes logical sense for the relationships and data it represents. The XML should also be readable, which yours seems to be as well. That being said, there are many ways to "skin this cat" and I will outline two of them here.

Here is the modified XML file:

<Data>
    <Crops>
        <Crop>
            <Name>Corn</Name>
            <Fields>
                <Field>
                    <Name>Field1</Name>
                    <Acres>2</Acres>
                    <Bushels>15</Bushels>
                    <Expenses>
                        <expense>12.0</expense>
                        <expense>20.0</expense>
                    </Expenses>
                </Field>
                <Field>
                    <Name>Field2</Name>
                    <Acres>30</Acres>
                    <Bushels>60</Bushels>
                    <Expenses>
                        <expense>45.0</expense>
                        <expense>70.0</expense>
                    </Expenses>
                </Field>
            </Fields>
        </Crop>
        <Crop>
            <Name>Soybeans</Name>
            <Fields>
                <Field>
                    <Name>Field2</Name>
                    <Acres>5</Acres>
                    <Bushels>1</Bushels>
                    <Expenses>
                        <expense>33.0</expense>
                        <expense>71.0</expense>
                    </Expenses>
                </Field>
                <Field>
                    <Name>Field1</Name>
                    <Acres>10</Acres>
                    <Bushels>20</Bushels>
                    <Expenses>
                        <expense>15.0</expense>
                        <expense>10.0</expense>
                    </Expenses>
                </Field>
            </Fields>
        </Crop>
        <Crop>
            <Name>Rice</Name>
            <Fields>
                <Field>
                    <Name>Field3</Name>
                    <Acres>15</Acres>
                    <Bushels>40</Bushels>
                    <Expenses>
                        <expense>15.50</expense>
                        <expense>44.79</expense>
                    </Expenses>
                </Field>
                <Field>
                    <Name>Field4</Name>
                    <Acres>4</Acres>
                    <Bushels>18</Bushels>
                    <Expenses>
                        <expense>0.0</expense>
                        <expense>21.0</expense>
                    </Expenses>
                </Field>
                <Field>
                    <Name>Field5</Name>
                    <Acres>15</Acres>
                    <Bushels>4</Bushels>
                    <Expenses>
                        <expense>62.0</expense>
                        <expense>27.45</expense>
                    </Expenses>
                </Field>
            </Fields>
        </Crop>
    </Crops>
</Data>

I created both of these as a unit test so they should both run as-is.

The first one, using LINQ (as is requested in your question) would look something like this the following. The general strategy is to load the xml file into and XmlDocument instance and use XPath syntax to query the nodes and traverse the document.

Dim xmlData As String = "<Data><Crops><Crop><Name>Corn</Name><Fields><Field><Name>Field1</Name><Acres>2</Acres><Bushels>15</Bushels><Expenses><expense>12.0</expense><expense>20.0</expense></Expenses></Field><Field><Name>Field2</Name><Acres>30</Acres><Bushels>60</Bushels><Expenses><expense>45.0</expense><expense>70.0</expense></Expenses></Field></Fields></Crop><Crop><Name>Soybeans</Name><Fields><Field><Name>Field2</Name><Acres>5</Acres><Bushels>1</Bushels><Expenses><expense>33.0</expense><expense>71.0</expense></Expenses></Field><Field><Name>Field1</Name><Acres>10</Acres><Bushels>20</Bushels><Expenses><expense>15.0</expense><expense>10.0</expense></Expenses></Field></Fields></Crop><Crop><Name>Rice</Name><Fields><Field><Name>Field3</Name><Acres>15</Acres><Bushels>40</Bushels><Expenses><expense>15.50</expense><expense>44.79</expense></Expenses></Field><Field><Name>Field4</Name><Acres>4</Acres><Bushels>18</Bushels><Expenses><expense>0.0</expense><expense>21.0</expense></Expenses></Field><Field><Name>Field5</Name><Acres>15</Acres><Bushels>4</Bushels><Expenses><expense>62.0</expense><expense>27.45</expense></Expenses></Field></Fields></Crop></Crops></Data>"
Dim doc As New System.Xml.XmlDocument()
doc.LoadXml(xmlData) 'Load the data from the string'
'If loading from a file, use this'
'doc.Load("C:\MyFileName.xml")'
Dim cropNodes As Xml.XmlNodeList = doc.SelectNodes("//Crop/Name") 'Use XPath syntax for querying'
Dim cropNames As New List(Of String)()
'Here, Linq is used to get all of the values of the name nodes within each crop (assuming each crop has only one name).'
cropNames.AddRange((From x As Xml.XmlNode In cropNodes Select x.InnerText).ToArray())

cropNames.ForEach(Sub(crop)
            Debug.WriteLine("Crop: " & crop)
        End Sub)

Dim fieldData As Xml.XmlNodeList = doc.SelectNodes("//Field")
For Each item As Xml.XmlNode In fieldData
    Debug.WriteLine("Field Name: " & item.SelectSingleNode("Name").InnerText())
Next

'Get expenses per field for each crop'
For Each name As String In cropNames
    Dim cropNode As Xml.XmlNode = doc.SelectSingleNode("//Crop/Name[text()='" & name & "']/..")
    Debug.WriteLine("Num Fields: " & cropNode.SelectNodes("Fields/Field").Count)
    For Each node As Xml.XmlNode In cropNode.SelectNodes("Fields/Field")
        Debug.WriteLine("Total Expenses for " & node.SelectSingleNode("Name").InnerText & ":$" & (From item As Xml.XmlNode In node.SelectNodes("Expenses/expense") Select Double.Parse(item.InnerText)).Sum())
    Next
Next
Debug.WriteLine("Done")

The next way (and this would be my preference) would be to just deserialize the XML into strongly typed .NET objects. This way, you can create functions within each class to do the work for you and avoid the XPath syntax all together and it becomes easier to add custom functionality.

Here is an example:

'In some test code:
Dim xmlData As String = "<Data><Crops><Crop><Name>Corn</Name><Fields><Field><Name>Field1</Name><Acres>2</Acres><Bushels>15</Bushels><Expenses><expense>12.0</expense><expense>20.0</expense></Expenses></Field><Field><Name>Field2</Name><Acres>30</Acres><Bushels>60</Bushels><Expenses><expense>45.0</expense><expense>70.0</expense></Expenses></Field></Fields></Crop><Crop><Name>Soybeans</Name><Fields><Field><Name>Field2</Name><Acres>5</Acres><Bushels>1</Bushels><Expenses><expense>33.0</expense><expense>71.0</expense></Expenses></Field><Field><Name>Field1</Name><Acres>10</Acres><Bushels>20</Bushels><Expenses><expense>15.0</expense><expense>10.0</expense></Expenses></Field></Fields></Crop><Crop><Name>Rice</Name><Fields><Field><Name>Field3</Name><Acres>15</Acres><Bushels>40</Bushels><Expenses><expense>15.50</expense><expense>44.79</expense></Expenses></Field><Field><Name>Field4</Name><Acres>4</Acres><Bushels>18</Bushels><Expenses><expense>0.0</expense><expense>21.0</expense></Expenses></Field><Field><Name>Field5</Name><Acres>15</Acres><Bushels>4</Bushels><Expenses><expense>62.0</expense><expense>27.45</expense></Expenses></Field></Fields></Crop></Crops></Data>"
Dim crops As CropData = Nothing

Dim ser As New Xml.Serialization.XmlSerializer(GetType(CropData))
Try
    crops = ser.Deserialize(Xml.XmlReader.Create(New IO.StringReader(xmlData)))
    For Each item In crops.Crops
        Debug.WriteLine("Total Expenses for crop: " & item.Name & ": $" & item.GetTotalExpenses())
    Next
Catch ex As Exception

End Try

'Other classes:
<Serializable()>
<Xml.Serialization.XmlRoot("Data")>
Public Class CropData

    'XmlIgnore is somewhat unnecessary on a private instance, but I like to be explicit
    <Xml.Serialization.XmlIgnore()> _
    Private _crops As List(Of Crop)

    Public Sub New()
        _crops = New List(Of Crop)()
    End Sub

    <Xml.Serialization.XmlArray(ElementName:="Crops"), Xml.Serialization.XmlArrayItem("Crop")>
    Public Property Crops As List(Of Crop)
        Get
            Return _crops
        End Get
        Set(value As List(Of Crop))
            If value IsNot Nothing Then
                _crops = value
            End If
        End Set
    End Property

End Class

<Serializable()>
Public Class Crop

    <Xml.Serialization.XmlIgnore()>
    Private _fields As List(Of Field)

    Public Sub New()
        _fields = New List(Of Field)()
    End Sub

    Public Function GetTotalExpenses() As Double
        Dim total As Double = 0D
        If Me.Fields.Any() Then
            Me.Fields.ForEach(Sub(item)
                total += item.GetFieldExpenses()
            End Sub)
        End If
        Return total
    End Function

    <Xml.Serialization.XmlElement(ElementName:="Name")>
    Public Property Name As String

    <Xml.Serialization.XmlArray(ElementName:="Fields"), Xml.Serialization.XmlArrayItem(ElementName:="Field")>
    Public Property Fields As List(Of Field)
        Get
            Return _fields
        End Get
        Set(value As List(Of Field))
            If value IsNot Nothing Then
                _fields = value
            End If
        End Set
    End Property

End Class

<Serializable()>
Public Class Field

    Public Sub New()

    End Sub

    Public Function GetFieldExpenses() As Double
        If Me.Expenses.Any() Then
            Return Me.Expenses.Sum()
        Else
            Return 0
        End If
    End Function


    <Xml.Serialization.XmlElement(ElementName:="Name")>
    Public Property Name As String
    <Xml.Serialization.XmlElement(ElementName:="Acres")>
    Public Property Acres As Double
    <Xml.Serialization.XmlElement(ElementName:="Bushels")>
    Public Property Bushels As Double

    <Xml.Serialization.XmlArray("Expenses"), Xml.Serialization.XmlArrayItem("expense")>
    Public Property Expenses As List(Of Double)

End Class

Upvotes: 1

Garrie
Garrie

Reputation: 1

If I want to load an XML file with Vb.net I check if it is well formed by attempting to load it into Excel using the XML feature on the data ribbon. Couldn't get your format to load (after cleaning up the etc's..). However the following seems to work:

<TblCrops>
<Field>
<Crop>Corn</Crop>
<Name>Field1</Name>
<Acres>120</Acres>
<Bushels>1000</Bushels>
<Expenses>
<expense1>100</expense1>
<expense2>200</expense2>
</Expenses>
</Field>
<Field>
<Crop>Corn</Crop>
<Name>Field2</Name>
<Acres>130</Acres>
<Bushels>1100</Bushels>
<Expenses>
<expense1>110</expense1>
<expense2>210</expense2>
</Expenses>
</Field>
</TblCrops>
<TblCrops>
<Field>
<Crop>Corn</Crop>
<Name>Field1</Name>
<Acres>120</Acres>
<Bushels>1000</Bushels>
<Expenses>
<expense1>100</expense1>
<expense2>200</expense2>
</Expenses>
 </Field>
<Field>
<Crop>Corn</Crop>
<Name>Field2</Name>
<Acres>130</Acres>
<Bushels>1100</Bushels>
<Expenses>
<expense1>110</expense1>
<expense2>210</expense2>
</Expenses>
</Field>
</TblCrops>

You could then perform a select query on the crop field and add all expense fields together. If you did not want the number of fields per record to change, depending on the number of expenses you could pull out the expenses into another field and relate them by a randomly generated record locator. IE

 <TblCrops>
 <Field>
 <Crop>Corn</Crop>
 <Name>Field1</Name>
 <Acres>120</Acres>
 <Bushels>1000</Bushels>
 <ExpenseID>106724</ExpenseID>
 </Field>
 <Field>
 <Crop>Corn</Crop>
 <Name>Field2</Name>
 <Acres>130</Acres>
 <Bushels>1100</Bushels>
 <ExpenseID>879065</ExpenseID>
 </Field>
 </TblCrops>

and

<TblExpenses>
<Expense>
<ExpenseID>106724</ExpenseID>
<amount>xxx</amount>
</Expense>
<Expense>
<ExpenseID>879065</ExpenseID>
<amount>yyy</amount>
</Expense>
<ExpenseID>879065</ExpenseID>
<amount>zzz</amount>
</Expense>
</TblExpenses>

Then you would need some LEFT OUTER JOIN trickery with a select statement on the field to produce the final record set of expenses from the two tables.

Upvotes: 0

Jerry Federspiel
Jerry Federspiel

Reputation: 1524

Check out Linq to Xml and VB's xml operators for a concise solution:

Function TotalExpenses(myData as XElement, myCropName as String) as Double
    Dim expenseElements = (From crop In myData.<Crop>
                          Where crop.<Name>.FirstOrDefault().Value = myCropName
                          From expenseElement In element.Descendants("Expenses").Descendants()
                          Select expenseElement)

    Return expenseElements.Sum(Function(ee) Double.Parse(ee.Value))
End Function

Upvotes: 0

Related Questions