thordijk
thordijk

Reputation: 15

Parse XML based on node attribute and edit when found

I am trying to write a VBA script that works as a macro for an Excel file and translates this to an XML file. This I have been able to achieve however it needs to check its own data on the fly to prevent double notations. The data is sorted based on a marker called a tagname. There exists the possibility that a certain tagname exist more than once.

What I would like to be able to do is parse the XML document that is being created based on this tagname and when a node with this tagname is found, the additional information should be appended to the info attribute of this node.

Currently, the XML hat is being produced looks like this:

<Section name="GS.0101PS02">
   <Translation key="Info">=GS+MCC-151B9 Filterbewaking</Translation>
</Section>
<Section name="GS.0102PS02">
   <Translation key="Info">=GS+MCC-152B3 Filterbewaking</Translation>
</Section>
<Section name="GS.0025LS01">
   <Translation key="Info">=GS+MCC-161B5 Niveau opvangbak</Translation>
</Section>
<Section name="GS.0026LS01">
   <Translation key="Info">=GS+MCC-161B15 Niveau opvangbak</Translation>
</Section>
<Section name="GS.0300PS02">
   <Translation key="Info">=GS+MCC-162B11 Filterbewaking</Translation>
</Section>
<Section name="GS.0141AV05">
  <Translation key="Info">=GS+CC1-150B3 Bunker 1 Eindklep Open <br/> =GS+CC1-150B5 Bunker 1 Eindklep Dicht</Translation>
</Section>

The last line is an example of the tagname that exists multiple times throughout the document and this is also how it should be formatted. Currently the code is only capable of doing this when the tagname repeats itself in the next row, however, this is not a given.

I have attempted realizing this using selectnodes() and selectsinglenode() and a couple other approached but have been unsuccessful the main reason I can think of is that my my XPath if formatted improperly.

The script I am running is as follows:

Sub RPCTranslatesCombinedInfoBackwardsChecking()

    Set oXMLDoc = CreateObject("MSXML2.DOMDocument")                                   'Create the XML document'
    Set oPI = oXMLDoc.createProcessingInstruction("xml", "version=""1.0"" encoding=""UTF-8""") 'Define the processing instructions'
    Set oRoot = oXMLDoc.CreateNode(1, "Translations", "urn:Riwo.Pcs.Localization")     'Create the root node "Translations" in the name space urn:Riwo.Pcs.Localization'
    
    Set xsi_Attribute = oXMLDoc.CreateAttribute("xmlns:xsi")                           'Add the first attribute to the namespace'
    xsi_Attribute.Value = "http://www.w3.org/2001/XMLSchema-instance"
    oRoot.Attributes.setNamedItem (xsi_Attribute)
    
    Set xsd_Attribute = oXMLDoc.CreateAttribute("xmlns:xsd")                           'Add the Second attribute to the namespace'
    xsd_Attribute.Value = "http://www.w3.org/2001/XMLSchema"
    oRoot.Attributes.setNamedItem (xsd_Attribute)
    
    Set code_Attribute = oXMLDoc.CreateAttribute("code")                               'Add the Third attribute to the namespace'
    code_Attribute.Value = "nl"
    oRoot.Attributes.setNamedItem (code_Attribute)
    
    Set description_Attribute = oXMLDoc.CreateAttribute("description")                 'Add the Fourth attribute to the namespace'
    description_Attribute.Value = "Dutch"
    oRoot.Attributes.setNamedItem (description_Attribute)
    
    oXMLDoc.AppendChild oRoot                                                          'Append the above defined node to the document'
    oXMLDoc.InsertBefore oPI, oXMLDoc.ChildNodes.Item(0)
    
    With ActiveSheet

        lRow = 2                                                                          'Start the macro operation at the second row of the active worksheet'
        sSectionName = ""                                                                 'Initialize the sectionname string'
        
        Do While .Cells(lRow, 4).Value <> ""                                              'While there is a value in the fourth column, starting at row 2'
    
            sLineName = .Cells(lRow, 1).Value                                                'Define a prefix for the section names'
            sSectionPrefix = Right(sLineName, Len(sLineName) - 1)
            
            Passes = 0
            
            sSectionName = .Cells(lRow, 4).Value                                             'Define the value of row x in column 4 as the section name'
            Set oElmSection = oXMLDoc.CreateNode(1, "Section", "urn:Riwo.Pcs.Localization")  'Create the section node'
            oXMLDoc.DocumentElement.AppendChild oElmSection                                  'Append the section node to the document'
            Set oAttr = oXMLDoc.CreateNode(2, "name", "urn:Riwo.Pcs.Localization")           'Add the name attribute to the section node'
            NodeName = sSectionPrefix & "." & sSectionName
            oAttr.NodeValue = NodeName                                                      'Define the name of the section node as being the prefix as well as the section name from the fourth column'
            oElmSection.SetAttributeNode oAttr                                               'Set the name attribute as an attribute of section'
                       
            ExistingNodes = oXMLDoc.SelectNodes("//Section").Attributes.getNamedItem([@name=NodeName]).Text
           
            
            Do While .Cells(lRow, 4).Value = sSectionName                                      'For all sections with the same section name do:'
            
               sInfoDescription_1 = .Cells(lRow, 1).Value                                      'Fetch the info data for the respective row'
               sInfoDescription_2 = .Cells(lRow, 2).Value
               sInfoDescription_3 = .Cells(lRow, 3).Value
               sInfoDescription_5 = .Cells(lRow, 5).Value
            
               If Passes = 0 Then                                                              'Check if first pass of the section"
                   
                   sInfo = sInfoDescription_1 & sInfoDescription_2 & sInfoDescription_3 & " " & sInfoDescription_5 'Combine info in 1 string'
                   Set oElmTranslation = oXMLDoc.CreateNode(1, "Translation", "urn:Riwo.Pcs.Localization") 'Create Translation node'
                   Set oAttr = oXMLDoc.CreateNode(2, "key", "urn:Riwo.Pcs.Localization")       'Add key attribute'
                   oAttr.NodeValue = "Info"                                                    'Define Info as key'
                   oElmTranslation.SetAttributeNode oAttr                                      'Set key attribute in Translation node'
                   oElmTranslation.AppendChild oXMLDoc.createTextNode(sInfo)                   'Use info text as info'
                   oElmSection.AppendChild oElmTranslation                                     'Append Translation node to section'
                   Passes = 1
                   lRow = lRow + 1                                                             'Procede to next row'
               
               Else                                                                            'Second or more passes over a section'
               
                   sInfo = " <br/> " & sInfoDescription_1 & sInfoDescription_2 & sInfoDescription_3 & " " & sInfoDescription_5           'Combine info in 1 string'
                   oElmTranslation.AppendChild oXMLDoc.createTextNode(sInfo)                   'Use info text as info'
                   lRow = lRow + 1                                                             'Procede to next row'
                   
               End If
            Loop                                                                             'Loop second while'
        Loop                                                                              'Loop first while'
    End With

    MsgBox oXMLDoc.XML                                                                 'Show result'
    oXMLDoc.Save "C:\Users\thomas.RIWO\Desktop\Translations\RPC test\test2.xml"        'Save the xml file'

End Sub

What I would like to know is how to retrieve an existing node from my xml and edit this.

Upvotes: 1

Views: 195

Answers (1)

Tim Williams
Tim Williams

Reputation: 166980

Here's a suggested approach using a scripting dictionary to track what sections have already been added, so we can easily get a reference to the required element to which the description needs to be added/appended.

Option Explicit

Sub RPCTranslatesCombinedInfoBackwardsChecking()

    Const THE_NS As String = "urn:Riwo.Pcs.Localization"

    Dim oXMLDoc As Object, oRoot As Object, dict As Object, sSectionPrefix As String
    Dim sSectionName As String, sDescription As String, txt, oPI As Object, lRow As Long
    Dim oElmSection As Object, dictKey As String, oElmTranslation As Object

    Set dict = CreateObject("scripting.dictionary")
    Set oXMLDoc = CreateObject("MSXML2.DOMDocument")
    Set oPI = oXMLDoc.createProcessingInstruction("xml", "version=""1.0"" encoding=""UTF-8""")

    'create the root element
    Set oRoot = CreateWithAttributes(oXMLDoc, "Translations", THE_NS, "", _
                   Array("xmlns:xsi", "http://www.w3.org/2001/XMLSchema-instance", _
                         "xmlns:xsd", "http://www.w3.org/2001/XMLSchema", _
                         "code", "nl", _
                         "description", "Dutch")) '

    oXMLDoc.appendchild oRoot

    lRow = 2

    With ActiveSheet
        Do While .Cells(lRow, 4).Value <> ""

            sSectionPrefix = Right(.Cells(lRow, 1).Value, Len(.Cells(lRow, 1).Value) - 1)
            sSectionName = .Cells(lRow, 4).Value
            dictKey = sSectionPrefix & "." & sSectionName

            'first time seeing this section name? - add to document and to dictionary
            If Not dict.exists(dictKey) Then
                Set oElmSection = CreateWithAttributes(oXMLDoc, "Section", THE_NS, "", _
                                                       Array("name", dictKey))
                oRoot.appendchild oElmSection

                Set oElmTranslation = CreateWithAttributes(oXMLDoc, "Translation", THE_NS, "", _
                                                       Array("key", "Info"))
                oElmSection.appendchild oElmTranslation
                Set dict(dictKey) = oElmTranslation 'store reference in dictionary
            End If

            Set oElmTranslation = dict(dictKey) 'get the node to add content

            sDescription = .Cells(lRow, 1).Value & " " & .Cells(lRow, 2).Value & " " & _
                          .Cells(lRow, 3).Value & " " & .Cells(lRow, 5).Value


            txt = oElmTranslation.Text

            If Len(txt) > 0 Then txt = txt & "[br]"
            txt = txt & sDescription

            oElmTranslation.NodetypedValue = txt

            lRow = lRow + 1
        Loop
    End With

    Debug.Print oXMLDoc.XML
    oXMLDoc.Save "C:\Users\thomas.RIWO\Desktop\Translations\RPC test\test2.xml"

End Sub

Having previously done a bunch of XML creation, I find it's much easier to see what you're doing if you factor out the repetitive parts into a separate utility method such as the one shown below. It's quite basic, but covers a lot of what you need to do while constructing an XML document.

'Utility method: create and return an element, with 
'   optional namespace, value and attributes
Function CreateWithAttributes(doc As Object, elName As String, _
    elNameSpace As String, elValue As String, Optional attr As Variant = Empty)
    Dim el, u, i As Long, att As Object, txt As Object
    'create the node
    Set el = doc.CreateNode(1, elName, elNameSpace)
    'if have attributes, loop and add
    If Not IsEmpty(attr) Then
        For i = 0 To UBound(attr) Step 2
            Set att = doc.CreateAttribute(attr(i))
            att.Value = attr(i + 1)
            el.Attributes.setNamedItem att
        Next i
    End If
    'any element content to add?
    If Len(elValue) > 0 Then
        Set txt = doc.createTextNode(elValue)
        el.appendchild txt
    End If
    Set CreateWithAttributes = el
End Function

Upvotes: 1

Related Questions