Neil

Reputation: 6039

Regex to match starting node of XML without the tag name

I have XML like this:

<A>
 <B>
  <C>
   Hello World
  </C>
 </B>
</A>

I want to replace the leading "<" of each tag with "<ns:" (and "</" with "</ns:"), resulting in the following XML:

<ns:A>
 <ns:B>
  <ns:C>
   Hello World
  </ns:C>
 </ns:B>
</ns:A>

What regular expression should I use in a text editor's find-and-replace to add the namespace prefix?

I tried the regex [<][^/], but it also selects the first character of the tag name, which I don't want to replace.

Note: I need this for manual editing in an editor that supports regex replace; I am not going to do this programmatically. The output XML fragment shown is an inner part of the complete XML, hence the namespace URI is not mentioned.

Upvotes: 2

Views: 806

Answers (3)

Michael Kay

Reputation: 163595

The output you have requested is not well-formed XML (it has an undeclared namespace prefix), and the regex you have been given will produce that output, which is useless, because no tool that expects XML will be able to handle it. This is one of the reasons why processing XML with regular expressions is such a bad idea. Use XML processing tools such as XSLT instead, every time.
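As a rough illustration of the XML-tool route (using Python's standard xml.etree.ElementTree rather than XSLT, which is a swapped-in technique and not what this answer names), the sketch below moves every element into a namespace and lets the serializer emit the declaration. The URI "urn:example:ns" and the helper name add_namespace are hypothetical, since the question never states the real namespace URI, and the sketch assumes the input elements are not already namespaced.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI: the question does not say which URI "ns" maps to.
NS_URI = "urn:example:ns"


def add_namespace(xml_text, prefix="ns", uri=NS_URI):
    """Put every element of xml_text into the namespace `uri`.

    Assumes the input elements carry no namespace yet.
    """
    # Ask the serializer to use our prefix instead of an auto-generated one.
    ET.register_namespace(prefix, uri)
    root = ET.fromstring(xml_text)
    for elem in root.iter():  # iter() visits the root and all descendants
        elem.tag = "{%s}%s" % (uri, elem.tag)
    return ET.tostring(root, encoding="unicode")


source = "<A><B><C>Hello World</C></B></A>"
result = add_namespace(source)
print(result)
```

Unlike a plain text replacement, the serialized result carries an xmlns:ns declaration on the root element, so a namespace-aware parser can read it back.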

Upvotes: 0

bua

Reputation: 4880

cat tf
<A>
 <B>
  <C>
   Hello World
  </C>
 </B>
</A>



[user@serv:~/] sed 's/<\(\/*\)/<\1ns:/g' tf

<ns:A>
 <ns:B>
  <ns:C>
   Hello World
  </ns:C>
 </ns:B>
</ns:A>

Upvotes: 1

Bohemian

Reputation: 425318

Use this regex replacement:

regex: (</?)
replacement: $1ns:
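To see why the capture group helps: the pattern consumes only "<" and an optional "/", never the first letter of the tag name, so the replacement can put back exactly what was matched and append the prefix. The original attempt [<][^/] consumed one character too many and also skipped closing tags. Here is a minimal sketch of the same replacement in Python, for checking the behaviour outside an editor; the function name add_prefix is made up for illustration, and Python spells the backreference \g<1> where most editors use $1.

```python
import re


def add_prefix(xml_text, prefix="ns"):
    """Insert `prefix:` after every '<' or '</' in xml_text."""
    # Group 1 holds '<' or '</'; the replacement re-emits it, then the prefix.
    return re.sub(r"(</?)", r"\g<1>" + prefix + ":", xml_text)


source = "<A>\n <B>\n  <C>\n   Hello World\n  </C>\n </B>\n</A>"
converted = add_prefix(source)
print(converted)
```

Note that this pattern matches every "<", so it would also rewrite an XML declaration, comments, or CDATA sections if the text contains any. For a fragment like the one in the question that is not a problem.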

Upvotes: 2
