Andrey Braslavskiy

Reputation: 221

Remove XML tag that contains a specific value with sed

I have a configuration file:

<configuration>
 <property>
    <name>name1</name>
    <value>value1</value>
    <description>desc1</description>
 </property>
 <property>
    <name>name2</name>
    <value>valueToRemove</value>
    <description>desc2</description>
 </property>
 <property>
    <name>name3</name>
    <value>value3</value>
    <description>desc3</description>
 </property>
 <property>
    <name>name3</name>
    <value>valueToRemove</value>
    <description>desc4</description>
 </property>
 <property>
    <name>name5</name>
    <value>valu5</value>
 </property>
</configuration>

I want to remove all property tags that contain the value valueToRemove.

I want the following output:

<configuration>
 <property>
    <name>name1</name>
    <value>value1</value>
    <description>desc1</description>
 </property>
 <property>
    <name>name3</name>
    <value>value3</value>
    <description>desc3</description>
 </property>
 <property>
    <name>name5</name>
    <value>valu5</value>
 </property>
</configuration>

The following sed command removes only the lines containing the value tags, not the whole property elements:

sed -i "/[<property>].*valueToRemove.*[<\/property]>/d"  "test"

Please help me; I am very new to bash and regex.

Upvotes: 0

Views: 229

Answers (1)

Marcus Müller

Reputation: 36482

Never parse XML with regexes. They are simply the wrong tool for XML and its variants.

Really, getting hold of an XML parser isn't hard nowadays. There are plenty of libraries and tools for that; especially if you're new to bash programming, why would you use bash for this?

From an OS perspective, the only difference between a bash script and, e.g., a Python script is the first line,

#!/path/to/program/that/will/interpret/this/script

and for you, that means you could use any scripting language, e.g. Python, that has a good XML library. With Python and lxml, it's a few lines of code, and you can be fairly sure that even XML that sed would struggle with comes out properly, as long as it's valid XML.
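To give an idea of what "a few lines" could look like, here is a minimal sketch. Note that it uses Python's built-in xml.etree.ElementTree rather than lxml, just so that nothing needs to be installed; the function name remove_properties_with_value and the shortened SAMPLE document are made up for illustration, and it assumes the property elements are direct children of the root element, as in the question.

```python
#!/usr/bin/env python3
import xml.etree.ElementTree as ET


def remove_properties_with_value(xml_text: str, unwanted: str) -> str:
    """Return the XML with every <property> whose <value> equals `unwanted` removed."""
    root = ET.fromstring(xml_text)
    # Iterate over a copy of the list, because we remove children while looping.
    for prop in list(root.findall("property")):
        value = prop.find("value")
        # A <property> without <value>, or with an empty one, is kept.
        if value is not None and (value.text or "").strip() == unwanted:
            root.remove(prop)
    return ET.tostring(root, encoding="unicode")


# Shortened, hypothetical version of the configuration from the question.
SAMPLE = """<configuration>
 <property>
    <name>name1</name>
    <value>value1</value>
 </property>
 <property>
    <name>name2</name>
    <value>valueToRemove</value>
 </property>
</configuration>"""

if __name__ == "__main__":
    print(remove_properties_with_value(SAMPLE, "valueToRemove"))
```

To work on a file in place, roughly like sed -i does, one could presumably load it with ET.parse("test"), apply the same loop to the result of getroot(), and save it back with the tree's write("test") method. Comments and the exact whitespace of the original file are not guaranteed to survive such a round trip.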

tl;dr: Don't use regexes/sed to parse XML. Use an XML parser. Bash is just a script interpreter, and there are far more potent scripting languages to deal with such tasks.

Upvotes: 1
