Reputation: 25
I am working with Tcl and trying to set up a regex to extract the data from my XML string. The code below includes an example of the strings I am dealing with; the regexp is meant to find the first closing bracket, capture the data up to the next opening bracket, and place that into the variable number. Unfortunately, the output I am getting is "<RouteLabel>Hurdman<" instead of the expected "Hurdman". Any help would really be appreciated.
set direction(1) {<RouteLabel>Hurdman</RouteLabel>}
regexp {^.*>(.*)<} $direction(1) number
Upvotes: 1
Views: 317
Reputation: 13252
Crash course in tDOM for this exact task:
Load tDOM (note that the package name is spelled differently, as tdom):
% package require tdom
0.8.3
Create an empty document with a root element called foobar:
% set doc [dom createDocument foobar]
domDoc02569130
Get a fix on the root:
% set root [$doc documentElement]
domNode025692E0
Set up one of your XML strings:
% set direction(1) {<RouteLabel>Hurdman</RouteLabel>}
<RouteLabel>Hurdman</RouteLabel>
Add it to the DOM tree at the root:
% $root appendXML $direction(1)
domNode025692E0
Get the string you want by XPath expression:
% $root selectNodes {string(//RouteLabel/text())}
Hurdman
Or by querying the root (this only works if a single text node has been inserted; otherwise you get all of them concatenated):
% $root asText
Hurdman
If you want to clear the DOM tree from the root to make it ready for appending new strings without the old ones interfering:
% foreach node [$root childNodes] {$node delete}
But if you use XPath expressions you should be able to append any number of XML strings and still retrieve their content.
Once again:
package require tdom
set doc [dom createDocument foobar]
set root [$doc documentElement]
set direction(1) {<RouteLabel>Hurdman</RouteLabel>}
$root appendXML $direction(1)
$root selectNodes {string(//RouteLabel/text())}
# => Hurdman
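To illustrate the point about appending several XML strings and still retrieving their content with XPath, here is a minimal sketch. It assumes tDOM is installed; the second route label (Bayshore) is made-up sample data, and `labels` is just an illustrative variable name.

```tcl
package require tdom

# Build an empty document and grab its root element, as above.
set doc  [dom createDocument foobar]
set root [$doc documentElement]

# Append two XML fragments to the same root.
# "Bayshore" is hypothetical sample data, not from the question.
foreach xml {
    {<RouteLabel>Hurdman</RouteLabel>}
    {<RouteLabel>Bayshore</RouteLabel>}
} {
    $root appendXML $xml
}

# Select every RouteLabel child of the root and collect its text.
set labels {}
foreach node [$root selectNodes {RouteLabel}] {
    lappend labels [$node text]
}
puts $labels

# Free the DOM tree once it is no longer needed.
$doc delete
```

Each selected node is queried on its own, so the labels come back as separate list elements rather than one concatenated string.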
Documentation: tdom (package)
Upvotes: 1
Reputation: 626738
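The issue here is not with the regex but with how you are using it: the first variable after the input string receives the whole match, not the capture group, which is why number ends up holding <RouteLabel>Hurdman<.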
The syntax you need is
regexp <PATTERN> <INPUT> <WHOLE_MATCH_VAR> <CAPTURE_1_VAR> ... <CAPTURE_n_VAR>
So, in your case, as you are not interested in the whole match, just put _ where the whole match is expected:
set direction(1) {<RouteLabel>Hurdman</RouteLabel>}
regexp {^.*>(.*)<} $direction(1) _ number
puts $number
This prints Hurdman. See the online Tcl demo.
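As a possible refinement, here is a small sketch that checks the return value of regexp (1 on a match, 0 otherwise) before using the captured variable, and uses a negated character class so the capture cannot run past the next opening bracket. The helper name `elementText` is hypothetical, and this is only meant for simple single-element strings like the one in the question, not for general XML.

```tcl
# Hypothetical helper: return the text between the first ">" and the
# next "<", or an empty string if the pattern does not match.
proc elementText {xml} {
    # "_" receives the whole match, which we ignore;
    # "number" receives the first capture group.
    if {[regexp {>([^<]*)<} $xml _ number]} {
        return $number
    }
    return ""
}

set direction(1) {<RouteLabel>Hurdman</RouteLabel>}
puts [elementText $direction(1)]
# => Hurdman
```

Checking the result of regexp avoids a "no such variable" error when the input does not match, since number is only set on a successful match.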
Upvotes: 1