Reputation: 2559
I need to find and replace the value of a specific XML element. The conditions are as follows:
My test XML looks like this:
<somenode name="node1">
<some></some>
<enabled>0</enabled>
<some></some>
</somenode>
<someothernode name="node2">
<some></some>
<enabled>0</enabled>
<some></some>
</someothernode>
<somenode name="node3">
<some></some>
<enabled>0</enabled>
<some></some>
</somenode>
I expect the first and third enabled elements to be changed. So far I have managed to write this sed command:
sed -n "1h;1!H;${;g;s|\(<somenode [^>]*>\)\(.*\)\(<enabled>\s*\)0\(\s*</enabled>\)\(.*</somenode>\)|\1\2\3 1 \4\5|g;p;}" test.xml
but it changes only the last one, and I believe it is due to greedy match. Any help would be appreciated.
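The greedy-match suspicion is easy to check on a much smaller input. The sketch below uses a made-up one-line sample (the `<e>` tags and the `X` marker are placeholders, not part of the real file): because `.*` grabs as much as it can, only the last `0` is rewritten.

```shell
# Hypothetical one-line input with two candidate matches.
input='<e>0</e><e>0</e>'

# .* is greedy, so \(.*<e>\) swallows everything up to the LAST <e>,
# and the substitution touches only the final 0.
result=$(printf '%s\n' "$input" | sed 's|\(.*<e>\)0|\1X|')

printf '%s\n' "$result"
```

The same thing happens in the multi-line command once the whole file has been collected into the pattern space.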
Upvotes: 2
Views: 9942
Reputation: 881293
Forget sed for complex multi-line processing. Seriously.
If you're not willing to use a proper XML tool, at least use a standard string processing tool that has proper branching statements :-)
If you can guarantee your file is formatted in the way you have it, you can use something like:
pax> echo '<somenode name="node1">
<some></some>
<enabled>0</enabled>
<some></some>
</somenode>
<someothernode name="node2">
<some></some>
<enabled>0</enabled>
<some></some>
</someothernode>
<somenode name="node3">
<some></some>
<enabled>0</enabled>
<some></some>
</somenode>
' | awk '
BEGIN {s = 0}
/^[ \t]*<somenode / {s=1}
/^[ \t]*<\/somenode>/ {s=0}
/^[ \t]*<enabled>0<\/enabled>/ {if (s==1) {sub(/<enabled>0</, "<enabled>1<")}}
{print}
'
to get:
<somenode name="node1">
<some></some>
<enabled>1</enabled>
<some></some>
</somenode>
<someothernode name="node2">
<some></some>
<enabled>0</enabled>
<some></some>
</someothernode>
<somenode name="node3">
<some></some>
<enabled>1</enabled>
<some></some>
</somenode>
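The trouble with that sort of method is that it doesn't handle what may be perfectly valid XML files. This particular version, for instance, only recognizes tags that sit on their own line, as in the sample, and an enabled element written exactly as <enabled>0</enabled>.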
That's why it's better to use a tool built specifically for the job. But, if you just want a quick hack and the file format is under your control, it's probably okay to use awk (or perl or python or your other quick-and-dirty scripting tool of choice).
Upvotes: 2
Reputation: 41
Use xmlstarlet if possible:
echo '
<root>
<somenode name="node1">
<some></some>
<enabled>0</enabled>
<some></some>
</somenode>
<someothernode name="node2">
<some></some>
<enabled>0</enabled>
<some></some>
</someothernode>
<somenode name="node3">
<some></some>
<enabled>0</enabled>
<some></some>
</somenode>
</root>
' > testfile.xml
xml val testfile.xml
xml el -v testfile.xml
xml ed --help
# version 1
xml ed -u "//somenode[1]/enabled" -v '1' \
-u "//somenode[2]/enabled" -v '1' \
testfile.xml
# version 2 (-L for in-place editing; xmlstarlet v1.0.2)
xml ed -L -u "//somenode[@name='node1']/enabled" -v '1' \
-u "//somenode[@name='node3']/enabled" -v '1' \
testfile.xml
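If every somenode should be updated, a single XPath can avoid listing the nodes one by one. This is only a sketch: it assumes xmlstarlet is installed as either `xmlstarlet` or `xml`, and it recreates the sample in a temporary file so that it stands alone. The `[.='0']` predicate restricts the update to elements whose value is currently 0.

```shell
# Sample input, recreated in a temporary file so the sketch is self-contained.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<root>
<somenode name="node1">
<some></some>
<enabled>0</enabled>
<some></some>
</somenode>
<someothernode name="node2">
<some></some>
<enabled>0</enabled>
<some></some>
</someothernode>
<somenode name="node3">
<some></some>
<enabled>0</enabled>
<some></some>
</somenode>
</root>
EOF

# Assumption: the binary is called xmlstarlet or xml, depending on the system.
XML=$(command -v xmlstarlet || command -v xml || true)

if [ -n "$XML" ]; then
  # Update every <enabled> that is a child of a <somenode> and currently holds 0.
  result=$("$XML" ed -u "//somenode/enabled[.='0']" -v '1' "$tmp")
  printf '%s\n' "$result"
else
  echo "xmlstarlet not found" >&2
fi
rm -f "$tmp"
```

Elements under someothernode are left alone because the XPath names somenode explicitly.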
Upvotes: 4
Reputation: 10188
Other people have already explained why it is generally not a good idea to process XML with regular expressions.
With all that in mind, here's the sed program to substitute text matching foo with bar between lines matching start and end (inclusive):
/start/,/end/s/foo/bar/
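Applied to the sample from the question, that idiom might look like the sketch below. It assumes each tag sits on its own line, as in the sample; the temporary file simply stands in for test.xml.

```shell
# Sample input from the question, written to a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<somenode name="node1">
<some></some>
<enabled>0</enabled>
<some></some>
</somenode>
<someothernode name="node2">
<some></some>
<enabled>0</enabled>
<some></some>
</someothernode>
<somenode name="node3">
<some></some>
<enabled>0</enabled>
<some></some>
</somenode>
EOF

# Within each line range from "<somenode " to "</somenode>", flip enabled from 0 to 1.
# The trailing space in "<somenode " keeps the range from starting on "<someothernode ".
result=$(sed '/<somenode /,/<\/somenode>/ s|<enabled>0</enabled>|<enabled>1</enabled>|' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

Because the range is line-based, no multi-line pattern space (and no greedy `.*`) is needed.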
Upvotes: 2
Reputation: 342323
Your requirement is quite simple, judging from your description, so there's no need to use XML parsers or tools if you don't want to. You can use just the shell (or whatever other shell tools you prefer):
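#!/bin/bash
flag=0  # 1 while inside a <somenode> element
# IFS= keeps leading whitespace, -r keeps backslashes literal
while IFS= read -r line
do
case "$line" in
*"<someothernode"* ) flag=0;;
*"<somenode"* ) flag=1;;
*"</somenode>"* ) flag=0;;
esac
if [ "$flag" -eq 1 ]; then
case "$line" in
*"<enabled"* )
printf '%s\n' "${line/<enabled>0/<enabled>1}"
;;
*) printf '%s\n' "$line";;
esac
else
printf '%s\n' "$line"
fi
done < "file"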
Upvotes: -1
Reputation: 123801
It seems you need some kind of loop in sed:
http://www.rtfiber.com.tw/~changyj/sed/html/p.20070613a.html
I still can't figure it out myself, though; this is just for your information.
Upvotes: 0
Reputation: 342323
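You can use gawk (gensub is a gawk extension). Setting RS to the empty string enables paragraph mode, so this assumes the nodes in the file are separated by blank lines:
gawk -v RS= '/somenode/{
$0=gensub("(.*<enabled>)([01])(</enabled>.*)", "\\11\\3","g",$0)
}1' file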
output
$ ./shell.sh
<somenode name="node1">
<some></some>
<enabled>1</enabled>
<some></some>
</somenode>
<someothernode name="node2">
<some></some>
<enabled>0</enabled>
<some></some>
</someothernode>
<somenode name="node3">
<some></some>
<enabled>1</enabled>
<some></some>
</somenode>
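When the nodes are not separated by blank lines, paragraph mode treats the whole file as a single record. A range pattern is one alternative that should work in any POSIX awk. The sketch below assumes one tag per line, as in the sample, and uses a temporary file in place of the real input.

```shell
# Sample input without blank lines between the nodes.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<somenode name="node1">
<some></some>
<enabled>0</enabled>
<some></some>
</somenode>
<someothernode name="node2">
<some></some>
<enabled>0</enabled>
<some></some>
</someothernode>
<somenode name="node3">
<some></some>
<enabled>0</enabled>
<some></some>
</somenode>
EOF

# The range pattern is active from a "<somenode " line through the next "</somenode>" line;
# sub() only changes lines inside that range, and the last rule prints every line.
result=$(awk '
/<somenode /,/<\/somenode>/ { sub(/<enabled>0<\/enabled>/, "<enabled>1</enabled>") }
{ print }
' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```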
Upvotes: 0
Reputation: 38033
It is generally a poor idea to try to use regexes to parse XML. See previous discussions such as "Parsing XML with REGEX in Java". (Also, your XML is not well-formed, since it does not have exactly one root element.) There are many free XML engines for parsing and manipulating XML in almost every language, and I'd recommend you use one of those.
Upvotes: 4