JenPann

Reputation: 61

Change attributes in nested XML

I have an XML file with nested elements. I need to increment the "id" attribute of each <proceduralStep> element and of its child <proceduralStep> elements. My first problem is that I haven't been able to change the attribute using node.Attributes("id").Value = node.Attributes("id").Value + 1; the expression node.Attributes("id").Value + 1 gives an error. That is for the parent element (/proceduralStep/). My second problem is that each node's attribute also needs to change when it is a child of another <proceduralStep>. So if the element is at /proceduralStep/proceduralStep, its id should be changed to 1.1. I've been searching the net for examples and explanations of how to do this but haven't found any that work.

Sample XML

<dmodule>
  <mainProcedure>
    <proceduralStep id="step1">
      <para>Step 1</para>
    </proceduralStep>
    <proceduralStep id="step2">
      <figure id="fig2">
        <title>xxxxx</title>
        <graphic infoEntityIdent="ICN-GAASI"></graphic>
      </figure>
    </proceduralStep>
    <proceduralStep id="step3">
      <para>Step 3 with link to step 2 (ID 23) here:
                <internalRef internalRefId="step2" internalRefTargetType="step"></internalRef></para>
      <figure id="fig3">
        <title>xxxxx</title>
        <graphic infoEntityIdent="ICN-GAASIB0"></graphic>
      </figure>
      <proceduralStep id="step3.1">
        <para>Step 3.2 with link to step 3.1 (ID 23a) here:
                    <internalRef internalRefId="step3.1" internalRefTargetType="step"></internalRef></para>
      </proceduralStep>
      <proceduralStep id="step3.2">
        <figure>
          <title>xxxxx</title>
          <graphic infoEntityIdent="ICN-GAASIB0-00-"></graphic>
        </figure>
        <proceduralStep id="step3.2.1">
          <figure>Step 3.3.1</figure>
        </proceduralStep>
        <proceduralStep id="step3.2.2">
          <para>Step 3.3.2 with link to step 3.3.1 (ID 23c1) here:
                        <internalRef internalRefId="step3.2.1" internalRefTargetType="step"></internalRef></para>
        </proceduralStep>
        <proceduralStep id="step3.2.3">
          <figure>Step 3.3.3</figure>
        </proceduralStep>
      </proceduralStep>
    </proceduralStep>
  </mainProcedure>
</dmodule>

Not working code


        Dim doc As XDocument = XDocument.Load(FILENAME)
        Dim directoryName As String = Path.GetDirectoryName(FILENAME)
        Dim root As XElement = doc.Root
        Dim prefixStep As String = "step"
        Dim prefixFig As String = "fig"
        Dim nameResult As String = Path.GetFileName(FILENAME)
        Dim ns As XNamespace = root.GetDefaultNamespace()
        Dim mainProcedure As XElement = root.Descendants("mainProcedure").FirstOrDefault()

        RenumberStep(mainProcedure, prefixStep, ns)
        RenumberFigures(mainProcedure, prefixFig, ns)

        For Each internalRef As XElement In doc.Descendants(ns + "internalRef")
            Dim oldId As String = CType(internalRef.Attribute("internalRefId"), String)
            If Not oldId Is Nothing Then
                If dictionary.ContainsKey(oldId) Then
                    internalRef.SetAttributeValue("internalRefId", dictionary(oldId))
                Else
                    '  internalRef.SetAttributeValue("internalRefId", "Error : " & oldId)
                End If
            End If
        Next internalRef

        doc.Save(FILENAME)

Module Module1
    Public dictionary As New Dictionary(Of String, String)
    Public dictionaryFig As New Dictionary(Of String, String)

    Sub RenumberStep(parent As XElement, prefix As String, ns As XNamespace)
        Dim index As Integer = 1
        For Each proceduralStep As XElement In parent.Elements(ns + "proceduralStep")
            Dim oldId = CType(proceduralStep.Attribute("id"), String)
            If Not oldId Is Nothing Then
                dictionary.Add(oldId, prefix + index.ToString())
                proceduralStep.SetAttributeValue("id", prefix + index.ToString())
                RenumberStep(proceduralStep, prefix + index.ToString() + ".", ns)
            Else
                proceduralStep.SetAttributeValue("id", prefix + index.ToString())
            End If
            index = index + 1
        Next proceduralStep
    End Sub

    Sub RenumberFigures(parent As XElement, prefix As String, ns As XNamespace)
        Dim index As Integer = 1

        For Each figure As XElement In parent.Elements(ns + "figure")
            Dim oldfigId = CType(figure.Attribute("id"), String)
            If Not oldfigId Is Nothing Then
                dictionaryFig.Add(oldfigId, prefix + index.ToString())
                figure.SetAttributeValue("id", prefix + index.ToString())
                RenumberFigures(figure, prefix + index.ToString() + ".", ns)
            Else
                figure.SetAttributeValue("id", prefix + index.ToString())
            End If
            index = index + 1
        Next figure
    End Sub
End Module

Upvotes: 0

Views: 602

Answers (3)

jdweng

Reputation: 34421

This is simple with a recursive algorithm and LINQ to XML:

Imports System.Collections.Generic
Imports System.Linq
Imports System.Xml.Linq

Module Module1
    Const FILENAME As String = "c:\temp\test.xml"
    Const OUTPUT_FILENAME As String = "c:\temp\test1.xml"

    Public dictionary As New Dictionary(Of String, String)
    Sub Main()
        Dim doc As XDocument = XDocument.Load(FILENAME)
        Dim root As XElement = doc.Root
        Dim ns As XNamespace = root.GetDefaultNamespace()
        Dim mainProcedure As XElement = root.Descendants(ns + "mainProcedure").FirstOrDefault()
        Dim prefix As String = "step"
        Renumber(mainProcedure, prefix, ns)

        For Each internalRef As XElement In doc.Descendants(ns + "internalRef")
            Dim oldId As String = CType(internalRef.Attribute("internalRefId"), String)
            If Not oldId Is Nothing Then

                If dictionary.ContainsKey(oldId) Then
                    internalRef.SetAttributeValue("internalRefId", dictionary(oldId))
                Else
                    internalRef.SetAttributeValue("internalRefId", "Error : " & oldId)
                End If
            End If
        Next internalRef

        doc.Save(OUTPUT_FILENAME)
    End Sub

    Sub Renumber(parent As XElement, prefix As String, ns As XNamespace)
        Dim index As Integer = 1
        For Each proceduralStep As XElement In parent.Elements(ns + "proceduralStep")
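            Dim oldId = CType(proceduralStep.Attribute("id"), String)
            ' Record the old-to-new mapping only when the step had an id; Add throws on a Nothing key
            If oldId IsNot Nothing Then dictionary.Add(oldId, prefix + index.ToString())
            proceduralStep.SetAttributeValue("id", prefix + index.ToString())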

            Renumber(proceduralStep, prefix + index.ToString() + ".", ns)
            index = index + 1
        Next proceduralStep
    End Sub

End Module

Output

<?xml version="1.0" encoding="utf-8"?>
<dmodule xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://www.s1000d.org/S1000D_4-0-1/xml_schema_flat/proced.xsd">
  <mainProcedure>
    <proceduralStep id="step1">
      <para>XX  xxx<acronym id="mosim">XX  xxx<acronymTerm>XX  xxx</acronymTerm>XX  xxx<acronymDefinition>XX  xxx</acronymDefinition>XX  xxx</acronym>XX  xxx<internalRef internalRefId="Error : Error : fig1" internalRefTargetType="figure" targetTitle="fig1">XX  xxx</internalRef>XX  xxx</para>
      <proceduralStep id="step1.1">
        <para>XX  xxx</para>
      </proceduralStep>
      <proceduralStep id="step1.2">
        <para>XX  xxx<acronymTerm internalRefId="mosim">XX  xxx</acronymTerm>XX  xxx</para>
      </proceduralStep>
      <proceduralStep id="step1.3">
        <para>XX  xxx</para>
      </proceduralStep>
    </proceduralStep>
    <proceduralStep id="step2">
      <para>XX  xxx<acronymTerm internalRefId="mosim">XX  xxx</acronymTerm>XX  xxx<internalRef internalRefId="Error : Error : fig1" internalRefTargetType="figure" targetTitle="fig1">XX  xxx</internalRef>XX  xxx</para>
      <proceduralStep id="step2.1">
        <para>XX  xxx<acronymTerm internalRefId="mosim">XX  xxx</acronymTerm>XX  xxx</para>
      </proceduralStep>
      <proceduralStep id="step2.2">
        <para>XX  xxx<acronymTerm internalRefId="mosim">XX  xxx</acronymTerm>XX  xxx</para>
      </proceduralStep>
    </proceduralStep>
    <proceduralStep id="step3">
      <para>XX  xxx<emphasis>XX  xxx</emphasis>XX  xxx</para>
    </proceduralStep>
    <proceduralStep id="step4">
      <para>XX  xxx<emphasis>XX  xxx</emphasis>XX  xxx</para>
    </proceduralStep>
    <proceduralStep id="step5">
      <para>XX  xxx<acronymTerm internalRefId="lola">XX  xxx</acronymTerm>XX  xxx<acronym id="cd">XX  xxx<acronymTerm>XX  xxx</acronymTerm>XX  xxx<acronymDefinition>XX  xxx</acronymDefinition>XX  xxx</acronym>XX  xxx<acronymTerm internalRefId="mosim">XX  xxx</acronymTerm>XX  xxx<acronymTerm internalRefId="cd">XX  xxx</acronymTerm>XX  xxx<acronym id="dvd">XX  xxx<acronymTerm>XX  xxx</acronymTerm>XX  xxx<acronymDefinition>XX  xxx</acronymDefinition>XX  xxx</acronym>XX  xxx</para>
    </proceduralStep>
  </mainProcedure>
</dmodule>

Upvotes: 1

dbasnett

Reputation: 11773

Here is a guess, since 'increment' isn't clearly defined. An XML literal is used for testing.

    Dim strXMLPath As String = "C:\Test\34 XML Parsing\XML File\CascadingStepsExample.xml"

    Dim doc As XElement
    ' doc = XElement.Load(strXMLPath)

    'for testing
    doc = <dmodule>
              <proceduralStep id="step21">
                  <para>Step 1</para>
              </proceduralStep>
              <proceduralStep id="step22">
                  <para>Step 2</para>
              </proceduralStep>
              <proceduralStep id="step23">
                  <para>Step 3 with link to step 2 (ID 23) here:
  <internalRef internalRefId="step23" internalRefTargetType="step"></internalRef>
                  </para>
                  <proceduralStep id="step23a">
                      <para>Step 3.1</para>
                  </proceduralStep>
                  <proceduralStep id="step23b">
                      <para>Step 3.2 with link to step 3.1 (ID 23a) here:
    <internalRef internalRefId="step23a" internalRefTargetType="step"></internalRef>
                      </para>
                  </proceduralStep>
                  <proceduralStep id="step23c">
                      <para>Step 3.3</para>
                      <proceduralStep id="step23c1">
                          <para>Step 3.3.1</para>
                      </proceduralStep>
                      <proceduralStep id="step23c2">
                          <para>Step 3.3.2 with link to step 3.3.1 (ID 23c1) here:
      <internalRef internalRefId="step23c1" internalRefTargetType="step"></internalRef>
                          </para>
                      </proceduralStep>
                      <proceduralStep id="step23c3">
                          <para>Step 3.3.3</para>
                      </proceduralStep>
                      <!-- end of step 3.3.3-->
                  </proceduralStep>
                  <!--end of step 3.3-->
              </proceduralStep>
              <!--end of step 3-->
          </dmodule>

    For Each el As XElement In doc...<proceduralStep>
        'increment logic GUESS
        Dim prfx As New System.Text.StringBuilder
        Dim num As New System.Text.StringBuilder
        Dim sufx As New System.Text.StringBuilder
        Dim id As New System.Text.StringBuilder(el.@id)
        Dim inNum As Boolean = False
        Dim inSuf As Boolean = False
        For x As Integer = 0 To id.Length - 1
            Dim n As Boolean = False
            If Integer.TryParse(id(x), Nothing) Then
                n = True
            End If
            Select Case True
                Case inSuf
                    sufx.Append(id(x))
                Case inNum AndAlso n
                    num.Append(id(x))
                Case inNum
                    inSuf = True
                    sufx.Append(id(x))
                Case n
                    inNum = True
                    num.Append(id(x))
                Case Else
                    prfx.Append(id(x))
            End Select
        Next
        If num.Length > 0 Then
            Dim i As Integer = Integer.Parse(num.ToString)
            i += 1
            el.@id = prfx.ToString & i.ToString & sufx.ToString
        End If
    Next

Upvotes: 0

djv

Reputation: 15774

The expression node.Attributes("id").Value + 1 won't work because Value is a string, and you can't add the number 1 to a string. So how do you increment your id? From what I gather, you need to increment these values

  • "step21"
  • "step22"
  • "step23"
  • "step23a"
  • "step23b"
  • "step23c"
  • "step23c1"
  • "step23c2"
  • "step23c3"

You need to define this. Write a function which takes a string and increments it according to your rules. If you provide the incremented values, we can help with the logic, probably. So add your new increment function,

Public Function IncrementId(id As String) As String
    Return id & "incremented"
End Function

and change your code to call your new increment function,

node("proceduralStep").SetAttribute("id", IncrementId(node.Attributes("id").Value))

and I guess that should do it.
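As a starting point, here is a minimal sketch of one possible rule. It assumes that "increment" means adding 1 to the first run of digits in the id and leaving the rest untouched; that rule is only a guess. The module name IdSketch and the function name IncrementFirstNumber are made up for this illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module IdSketch
    ' Hypothetical helper (not from the question): adds 1 to the first run of
    ' digits in an id and keeps the text before and after it unchanged.
    ' Assumption: the number fits in an Integer; ids without digits are returned as-is.
    Public Function IncrementFirstNumber(id As String) As String
        If String.IsNullOrEmpty(id) Then Return id

        ' Group 1: leading non-digits, group 2: first run of digits, group 3: the rest
        Dim m As Match = Regex.Match(id, "^([^0-9]*)([0-9]+)(.*)$")
        If Not m.Success Then Return id

        Dim n As Integer = Integer.Parse(m.Groups(2).Value) + 1
        Return m.Groups(1).Value & n.ToString() & m.Groups(3).Value
    End Function
End Module
```

Under this assumed rule, "step21" would become "step22" and "step23c1" would become "step24c1". If your rule is different, only the body of this function needs to change.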


However, I prefer XML serialization over XmlDocument for a few reasons. The biggest reason is that you can model your data with .NET classes and you get strong typing! For example, System.Xml.XmlAttribute.Value is always a string, but sometimes the data in your XML file is not. In your case, however, it happens to be a string.

So here is what I would do. Add these classes which define your data model

Imports System.Xml.Serialization

<XmlRoot>
Public Class dmodule
    <XmlElement("proceduralStep")>
    Public Property proceduralSteps As List(Of proceduralStep)
End Class

Partial Public Class proceduralStep
    <XmlElement("proceduralStep")>
    Public Property proceduralSteps As List(Of proceduralStep)
    <XmlAttribute>
    Public Property id As String
    <XmlElement>
    Public Property para As para
End Class

Public Class para
    <XmlText>
    Public Property Description As String
    <XmlElement>
    Public Property internalRef As internalRef
End Class

Public Class internalRef
    <XmlAttribute>
    Public Property internalRefId As String
    <XmlAttribute>
    Public Property internalRefTargetType As String
End Class

With these, you can deserialize the XML into strongly-typed .NET objects in memory whose properties can be iterated and modified (instead of passing strings into doc.SelectNodes("/dmodule/proceduralStep")).

Now you can deserialize (file >> memory) and serialize (memory >> file).

Dim myDmodule As dmodule
Dim serializer As New XmlSerializer(GetType(dmodule))

' read to memory
Using sr As New StreamReader("C:\Test\34 XML Parsing\XML File\CascadingStepsExample.xml")
    myDmodule = CType(serializer.Deserialize(sr), dmodule)
End Using

' write to file
Using sw As New StreamWriter("C:\Test\34 XML Parsing\XML File\CascadingStepsExample_inc.xml")
    serializer.Serialize(sw, myDmodule)
End Using

You can add some functions to recursively find all id attributes and increment them

' increment function
Public Shared Function incrementId(id As String) As String
    Return id & "incremented" ' how do you REALLY increment this?
End Function
' recursive id finder and incrementer method
Public Shared Sub incrementIds(steps As IEnumerable(Of proceduralStep))
    For Each s In steps
        If Not String.IsNullOrEmpty(s.id) Then
            s.id = incrementId(s.id)
        End If
        incrementIds(s.proceduralSteps)
    Next
End Sub

Just call the function after you have deserialized to your model and serialize back to the file.

Dim myDmodule As dmodule
Dim serializer As New XmlSerializer(GetType(dmodule))

Using sr As New StreamReader("C:\Test\34 XML Parsing\XML File\CascadingStepsExample.xml")
    myDmodule = CType(serializer.Deserialize(sr), dmodule)
End Using

' increment recursively
incrementIds(myDmodule.proceduralSteps)

Using sw As New StreamWriter("C:\Test\34 XML Parsing\XML File\CascadingStepsExample_inc.xml")
    serializer.Serialize(sw, myDmodule)
End Using

Still, we're missing the increment logic, so you need to come up with it, and again, we can help with that.
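If the goal turns out to be the positional numbering described in the question (step1, step1.1, and so on), the rule could be sketched roughly as below. This is only an assumption about what you want. To keep the example self-contained it uses a stand-in StepNode class rather than the model classes above; StepNode, RenumberSketch and Renumber are hypothetical names.

```vbnet
Imports System
Imports System.Collections.Generic

Module RenumberSketch
    ' Stand-in for a step in the model (hypothetical; only the parts needed here)
    Public Class StepNode
        Public Property Id As String
        Public Property Children As New List(Of StepNode)
    End Class

    ' Assigns ids by position: prefix & 1, prefix & 2, ... at this level,
    ' then recurses so children get "<parent id>." & 1, & 2, ...
    Public Sub Renumber(steps As IEnumerable(Of StepNode), prefix As String)
        If steps Is Nothing Then Return

        Dim index As Integer = 1
        For Each s As StepNode In steps
            s.Id = prefix & index.ToString()
            Renumber(s.Children, s.Id & ".")
            index += 1
        Next
    End Sub
End Module
```

Calling Renumber(topLevelSteps, "step") on a list of top-level steps would give the first step the id "step1" and its first child "step1.1". Note that this overwrites the old ids, so any internalRefId values that point at them would need to be updated separately.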

Upvotes: 0
