Lucian

Reputation: 115

TCL: how do I split an XML file by tag

I have an XML file with the following structure:

<?xml version="1.0" encoding="UTF-8"?>
  <header>
    <name>generic_1</name>
  </header>
  <body>
    <resources>
      <resource guid="ae8c34ad-a4e6-47fe-9b7d-cd60223754fe">
      </resource>
      <resource guid="fe236467-3df5-4019-9d55-d4881dfabae7">
      </resource>
    </resources>
  </body>

I need to edit the information of each resource so I tried to split the file by the string </resource> but TCL doesn't split it properly.

This is what I tried: split $file "</resource>". I also tried escaping the <, / and > characters but still no success.

Can you please help me with an elegant solution? I can do it by taking each line and determining where the resource ends, but a split would be nicer, if it can be done.

Edit: I can't use tdom. I am editing the file as a text file, not as an XML file.

Thank you

Upvotes: 1

Views: 629

Answers (2)

Peter Lewerin

Reputation: 13252

This is not an answer, just two additions to mrcalvin's answer, put here for formatting purposes.

First, your XML is invalid, as it lacks a root element (maybe it's snipped out).

Second, you didn't describe how you want to edit the nodes. Two obvious ways are to add a new attribute value and to add a new child node. This is how you can choose between the two with tdom, based on the value of the guid attribute:

set nodes [$root selectNodes //resources/resource]
foreach node $nodes {
    switch [$node getAttribute guid] {
        ae8c34ad-a4e6-47fe-9b7d-cd60223754fe {
            $node setAttribute foo bar
        }
        fe236467-3df5-4019-9d55-d4881dfabae7 {
            $node appendChild [$doc createElement quux]
        }
        default {
            error "unknown resource"
        }
    }
}

If you wish to add something more complex than a child node, there are several ways to do so, including using node commands, appending an XML literal, appending via a script (most useful when several similar additions are made), and appending a nested Tcl list that describes a node structure with attributes.

You can then get the edited DOM structure as XML by calling $doc asXML.
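To tie the steps together, here is a minimal end-to-end sketch of parse, edit, and serialize. It assumes tdom is installed, and because the XML in the question has no root element, the sample wraps it in a made-up <config> root purely for illustration. The variable names (xml, guid, out) are placeholders as well.

```tcl
package require tdom

# Hypothetical sample: the question's structure wrapped in an assumed
# <config> root element so that it parses as well-formed XML.
set xml {<config>
  <header><name>generic_1</name></header>
  <body>
    <resources>
      <resource guid="ae8c34ad-a4e6-47fe-9b7d-cd60223754fe"></resource>
      <resource guid="fe236467-3df5-4019-9d55-d4881dfabae7"></resource>
    </resources>
  </body>
</config>}

set doc  [dom parse $xml]
set root [$doc documentElement]

# Pick one resource by its guid attribute with an XPath predicate.
set guid ae8c34ad-a4e6-47fe-9b7d-cd60223754fe
set node [lindex [$root selectNodes "//resources/resource\[@guid='$guid'\]"] 0]

# Edit it: set an attribute, then append a child from an XML literal.
$node setAttribute foo bar
$node appendXML "<quux/>"

# Serialize the edited tree back to text and release the DOM.
set out [$doc asXML]
$doc delete
puts $out
```

The serialized text in out can then be written back to the file with the usual open/puts/close sequence.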

Upvotes: 2

mrcalvin

Reputation: 3434

Suggestion

XML handling in Tcl has been covered numerous times here. The general recommendation is to use tdom and XPath expressions to navigate the DOM and extract data:

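package require tdom
# $xml is expected to hold the document text
set doc  [dom parse $xml]
set root [$doc documentElement]
# Returns a list of all <resource> nodes under <resources>
$root selectNodes //resources/resource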

Comment

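split breaks up a string on a per-character basis. Its last argument is interpreted as a set of individual split characters, not as a single split string, so split $file "</resource>" splits at every <, /, >, and each of the letters in resource. Besides, even a split on the whole string would not give you what you want.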
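If the file really has to be treated as plain text, the behavior described above and one common workaround can be sketched as follows. This is only an illustration: the text variable is a made-up stand-in for the file contents, and the choice of \x01 as a temporary separator assumes that character never occurs in the input (it is not a legal XML 1.0 character).

```tcl
# Hypothetical stand-in for the file contents read into $file in the question.
set text "<resource guid=\"a\">\n</resource>\n<resource guid=\"b\">\n</resource>\n"

# split treats every character of its second argument as a separator,
# so "</b>" means "split on <, /, b or >", not on the literal string.
set naive [split "a</b>c" "</b>"]

# Workaround: map the multi-character delimiter to one character that
# is assumed not to occur in the text, then split on that character.
set sep "\x01"
set chunks [split [string map [list "</resource>" $sep] $text] $sep]

foreach chunk $chunks {
    puts "chunk: [string trim $chunk]"
}
```

Note that the closing tag itself is consumed by the split, and the last chunk is whatever follows the final </resource>. After editing the chunks, join $chunks "</resource>" puts the delimiter back.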

Upvotes: 4
