Regex for Removing <a> tag text that is between <ul> and <li> C#

Question

I have the following HTML. I tried many regexes to remove the hyperlink text that sits inside the <ul> and <li> tags only, but I could not find one that removes the <a> tag text. Whenever an <a> tag appears inside a <ul> and <li>, I want to replace its text with an empty string.


 About
 Apps

I have tried this regex, but it is not working. Here, input is a string that contains the HTML.

input = Regex.Replace(input, @"]*?>]*?>(?", string.Empty);

Please help me out. Thank You

Anirudha · Accepted Answer

Regex is not a good choice for parsing HTML.

HTML is neither strict nor regular in its format.

Use HtmlAgilityPack instead.

Regex is meant for regular languages, and HTML is not one.

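You can use this code to clear the anchor text using HtmlAgilityPack:

using HtmlAgilityPack;

HtmlDocument doc = new HtmlDocument();
doc.Load(yourStream); // or doc.LoadHtml(input) if the HTML is already in a string

// Select every anchor that sits inside an <li> of a <ul>
var anchors = doc.DocumentNode.SelectNodes("//ul/li//a");
if (anchors != null) // SelectNodes returns null when nothing matches
{
    foreach (var item in anchors)
    {
        item.InnerHtml = string.Empty; // clear the anchor text, keep the <a> tag itself
    }
}
// Don't forget to save, e.g. doc.Save(outputStream)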

I want to remove the <a> tag text using regex only.

Regex.Replace(input,@"(?<=]*>)\s*)","",RegexOptions.Singleline);
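For simple, well-formed markup, a regex-only approach could be sketched in two passes: first match each <ul>...</ul> block, then empty every anchor inside it. This is only an illustration, not the pattern from the answer above. The class name AnchorTextStripper, the method Strip, and the sample input are hypothetical, and the sketch assumes lists are not nested and that attribute values contain no ">" character. It clears every anchor inside a <ul>, on the assumption that all of them sit inside <li> items.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorTextStripper
{
    // Matches one <ul>...</ul> block. Assumes lists are not nested.
    private static readonly Regex UlBlock = new Regex(
        @"<ul\b[^>]*>.*?</ul>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Captures the opening and closing anchor tags; the content between them is not captured.
    private static readonly Regex Anchor = new Regex(
        @"(<a\b[^>]*>).*?(</a>)",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Empties the text of every anchor found inside a <ul> block.
    // Anchors outside any <ul> are left untouched.
    public static string Strip(string html)
    {
        return UlBlock.Replace(html, ul => Anchor.Replace(ul.Value, "$1$2"));
    }

    public static void Main()
    {
        // Hypothetical sample input: one anchor outside the list, two inside it.
        string input =
            "<a href=\"/\">Home</a>" +
            "<ul><li><a href=\"/about\">About</a></li>" +
            "<li><a href=\"/apps\">Apps</a></li></ul>";

        Console.WriteLine(Strip(input));
        // prints: <a href="/">Home</a><ul><li><a href="/about"></a></li><li><a href="/apps"></a></li></ul>
    }
}
```

Restricting the second pass to the text of each matched <ul> block is what keeps anchors elsewhere in the page intact. With nested lists or malformed markup the first pattern stops at the first </ul> it meets, which is where an HTML parser becomes the safer choice.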
