NoviceMe

Reputation: 3256

Find and replace text inside xml document using regular expression

I am using a C# console app to load an XML document. Once the XmlDocument is loaded, I want to search for a link with a specific href:

href="/abc/def

inside the XML document.

Once that node is found, I want to strip the tag completely and just show Hello:

<a href="/abc/def">Hello</a>

I think I can simply find the tag using a regex. But can anyone please tell me how I can remove the tag carrying that href completely using a regex?

Upvotes: 1

Views: 3077

Answers (3)

Jive Boogie

Reputation: 1265

You could try

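using System.Xml;

string x = @"<?xml version='1.0'?>
<EXAMPLE>
    <a href='/abc/def'>Hello</a>
</EXAMPLE>";

XmlDocument doc = new XmlDocument();
doc.LoadXml(x);

// Find the first <a> element whose href is exactly '/abc/def'.
XmlNode n = doc.SelectSingleNode("//a[@href='/abc/def']");
if (n != null)
{
    XmlNode p = n.ParentNode;

    // Build a plain <a> without the href attribute, keeping the original text.
    XmlNode newNode = doc.CreateElement("a");
    newNode.InnerText = n.InnerText;

    // Swap it in at the same position as the original node.
    p.ReplaceChild(newNode, n);
}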

I'm not really sure if this is what you are trying to do, but it should be enough to get you headed in the right direction.
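
If the goal is to drop the element entirely and keep only its text, one possible variation is to replace the matched node with a text node. This is only a sketch under that assumption; the class name LinkStripper and the method StripFirst are made up for illustration and do not come from the question.

```csharp
using System;
using System.Xml;

static class LinkStripper
{
    // XPath for the first <a> whose href is exactly '/abc/def'.
    const string Query = "//a[@href='/abc/def']";

    // Replaces the first matching <a> with a text node holding its text
    // and returns the resulting XML. The input is returned re-serialized
    // and otherwise unchanged when nothing matches.
    public static string StripFirst(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        XmlNode link = doc.SelectSingleNode(Query);
        if (link != null && link.ParentNode != null)
        {
            XmlNode text = doc.CreateTextNode(link.InnerText);
            link.ParentNode.ReplaceChild(text, link);
        }
        return doc.OuterXml;
    }

    static void Main()
    {
        string xml = "<EXAMPLE><a href='/abc/def'>Hello</a></EXAMPLE>";
        Console.WriteLine(StripFirst(xml)); // <EXAMPLE>Hello</EXAMPLE>
    }
}
```

Note that InnerText flattens any nested markup inside the link, so this fits best when the link contains plain text only.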

Upvotes: 0

Jirka Hanika

Reputation: 13529

The most popular technology for similar tasks is called XPath. (It is also a key component of XQuery and XSLT.) Would the following perhaps solve your task, too?

root.SelectSingleNode("//a[@href='/abc/def']").InnerText = "Hello";
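
The question shows the href without a closing quote, which might mean that any link whose href begins with /abc/def should be handled, possibly more than once per document. That is a guess, not something the question states. Under that assumption, a rough sketch using the XPath starts-with function could look like this; PrefixLinkStripper and StripAll are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

static class PrefixLinkStripper
{
    // Replaces every <a> whose href starts with '/abc/def' with its text.
    public static string StripAll(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // Copy the matches first so the document is not modified
        // while the XPath result list is still being enumerated.
        var matches = new List<XmlNode>();
        foreach (XmlNode node in doc.SelectNodes("//a[starts-with(@href, '/abc/def')]"))
        {
            matches.Add(node);
        }

        foreach (XmlNode link in matches)
        {
            if (link.ParentNode != null)
            {
                link.ParentNode.ReplaceChild(doc.CreateTextNode(link.InnerText), link);
            }
        }
        return doc.OuterXml;
    }

    static void Main()
    {
        string xml = "<r><a href='/abc/def/1'>One</a><a href='/xyz'>Two</a></r>";
        Console.WriteLine(StripAll(xml));
    }
}
```

Links with other href values are left untouched, so only the matching ones lose their markup.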

Upvotes: 0

Jason Meckley

Reputation: 7591

XML and HTML are much the same thing: tagged content. XML is just stricter in its formatting. For this use case I would use transformations and XPath queries to rebuild the document. As @Yahia stated, regex on tagged documents is typically a bad idea. The regex needed for parsing is far too complex to be effective as a generic solution.
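
To make the transformation idea concrete, here is a minimal sketch using XSLT through .NET's XslCompiledTransform. The stylesheet is an assumption about what such a transformation could look like: an identity template copies everything, and a second template replaces the matching link with its children. LinkTransform and StripLink are illustrative names only.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

static class LinkTransform
{
    const string Stylesheet = @"<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='xml' omit-xml-declaration='yes'/>
  <!-- Identity template: copy every node and attribute unchanged. -->
  <xsl:template match='@*|node()'>
    <xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>
  </xsl:template>
  <!-- Override: emit only the children of the matching link. -->
  <xsl:template match=""a[@href='/abc/def']"">
    <xsl:apply-templates select='node()'/>
  </xsl:template>
</xsl:stylesheet>";

    // Runs the stylesheet over the given XML and returns the result as a string.
    public static string StripLink(string xml)
    {
        var xslt = new XslCompiledTransform();
        using (XmlReader styleReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            xslt.Load(styleReader);
        }

        using (XmlReader input = XmlReader.Create(new StringReader(xml)))
        using (var output = new StringWriter())
        {
            xslt.Transform(input, null, output);
            return output.ToString();
        }
    }

    static void Main()
    {
        Console.WriteLine(StripLink("<EXAMPLE><a href='/abc/def'>Hello</a></EXAMPLE>"));
    }
}
```

Compared with editing the DOM by hand, the stylesheet keeps the matching rule in one declarative place and handles every occurrence in the document, at the cost of pulling in XSLT.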

Upvotes: 1
