sathishkumar

Reputation: 347

Removing the [<] character from a particular tag in XML using regex or shell

I'm new to regex, so it would be great if you could shed some light on this.

I have a big XML file of around 50k lines, generated by a third-party tool.

It contains lines like this:

<title>Apache 2.2 < 2.2.28 Multiple Vulnerabilities</title>

I just want to remove the < inside every title tag in the whole XML file.

I tried some patterns in vim and sed, but had no luck.

Upvotes: 0

Views: 51

Answers (1)

Kenney

Reputation: 9093

In vim you can do this:

:%s@\(<title>.*\)<\(.*</title>\)@\1\2@

(where % means 'the entire file', and \1 and \2 are back-references to the \(..\) parts of the expression)

Or, better yet:

:%s@\(<title>.\{-}\)<\(.\{-}</title>\)@\1\2@

(the \{-} is the non-greedy version of *).

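However, I'm assuming that you want to get rid of the < because it is illegal XML syntax. In that case you could replace it with the entity &lt; instead of deleting it, like so:

:%s@\(<title>.\{-}\)<\(.\{-}</title>\)@\1\&lt;\2@

(in the replacement part, \& is a literal ampersand; a bare & would insert the whole match)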
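Since the question also mentions sed, here is a rough sed equivalent of the last command. It is only a sketch under a few assumptions: each title element sits on a single line, there is at most one title per line, and each title contains at most one stray <. The function name escape_title_lt is made up for illustration.

```shell
# Hypothetical helper: replace one stray '<' inside a single-line
# <title>...</title> element with the XML entity &lt;.
# [^<]* keeps each half of the match free of other '<' characters, so
# well-formed titles (no stray '<') are left untouched.
# In the sed replacement, \& is a literal ampersand.
escape_title_lt() {
  sed 's@\(<title>[^<]*\)<\([^<]*</title>\)@\1\&lt;\2@' "$@"
}

# Example on the line from the question (reads from stdin when no file is given).
printf '%s\n' '<title>Apache 2.2 < 2.2.28 Multiple Vulnerabilities</title>' | escape_title_lt
```

To run it on a file, pass the file name and redirect the output to a new file, then compare the two before replacing the original.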

Upvotes: 3
