Reputation: 243
I'm writing a Unix shell script that needs to pretty-print XML files, but the catch is that some portions of them must not be touched: the Apache Jelly scripts embedded in those files. So I need to convert this:
<proc source="customer"><scriptParam value="_user"/><scriptText><jelly:script>
<jelly:log level="info">
this text needs
to keep its indent level
and this is none of my business
</jelly:log>
<!-- get date -->
<sql:query var="rs"><![CDATA[
select sysdate
from dual
]]></sql:query>
</jelly:script>
</scriptText></proc>
into this:
<proc source="customer">
<scriptParam value="_user"/>
<scriptText>
<jelly:script>
<jelly:log level="info">
this text needs
to keep its indent level
and this is none of my business
</jelly:log>
<!-- get date -->
<sql:query var="rs"><![CDATA[
select sysdate
from dual
]]></sql:query>
</jelly:script>
</scriptText>
</proc>
Notice that the only change to the jelly:script element is a newline before it.
I couldn't find any option in xmllint or xmlstarlet to ignore a certain element. Is there any tool that can help me achieve this? I'm on Linux, if it matters.
Upvotes: 1
Views: 1183
Reputation: 335
If the requirement is that no whitespace inside the jelly:script element may change, you can use xml_pp (on Linux it is installed with the Perl package perl-XML-Twig). The option -p some-element preserves all whitespace inside the named elements:
xml_pp -p jelly:script thefile.xml
That will create this:
<proc source="customer">
  <scriptParam value="_user"/>
  <scriptText>
    <jelly:script>
<jelly:log level="info">
this text needs
to keep its indent level
and this is none of my business
</jelly:log>
<!-- get date -->
<sql:query var="rs"><![CDATA[
select sysdate
from dual
]]></sql:query>
</jelly:script>
  </scriptText>
</proc>
As you can see, the start tag <jelly:script> is still indented, because the added spaces are outside the element.
If that is also forbidden, either choose the element one level higher (scriptText) for the -p option, or pipe the output through a command that removes those spaces again:
xml_pp -p jelly:script thefile.xml | perl -pe 's/^\s*(<jelly:script>)/$1/'
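Since the question is about a shell script, the pipeline above could be wrapped in a small function. This is only a sketch: the function names strip_jelly_indent and pretty_print_jelly are made up for illustration, sed is swapped in for the perl one-liner, and the pattern is widened on the assumption that the opening tag may carry attributes (for example a namespace declaration). It assumes xml_pp is installed and on PATH.

```shell
#!/bin/sh

# Remove the indentation that the pretty printer puts in front of the
# opening <jelly:script> tag. Matches the tag with or without attributes
# (assumption: the real files may declare namespaces on it).
strip_jelly_indent() {
    sed 's/^[[:space:]]*\(<jelly:script[[:space:]>]\)/\1/'
}

# Pretty-print one file to stdout, leaving the content of jelly:script
# untouched. Requires xml_pp from XML::Twig.
pretty_print_jelly() {
    xml_pp -p jelly:script "$1" | strip_jelly_indent
}
```

It might then be called as pretty_print_jelly thefile.xml > thefile.pretty.xml. Like the perl version, the filter works line by line, so it would also strip the indentation of a nested <jelly:script> tag that starts a line inside the preserved content.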
Upvotes: 1