Reputation: 1
I am trying to remove a particular property from an HTML string.
Here is my sample HTML string.
<span lang=EN-GB style='font-size:10.0pt;line-height:115%;font-family:"Tahoma","sans-serif";color:#17365D'>Thank you</span>
Is there a way to remove the line-height:115%; property from the string using Regex in C#/.NET, so that the output looks like this?
<span lang=EN-GB style='font-size:10.0pt;font-family:"Tahoma","sans-serif";color:#17365D'>Thank you</span>
I tried the Regex below, but it removes the entire style attribute. I only want to remove the line-height property.
Regex.Replace(html, @"<([^>]*)(?:style)=(?:'[^']*'|""[^""]*""|[^\s>]+)([^>]*)>", "<$1$2>", RegexOptions.IgnoreCase);
I just need to match the line-height property inside the style attribute, whatever its value, and remove everything up to and including the terminating semicolon (;). Any help would be greatly appreciated. Thanks.
Upvotes: 0
Views: 1209
Reputation: 1
Thanks, everyone, for your kind suggestions. I have worked out a Regex for this situation. Here it is, in case anyone is interested.
html = Regex.Replace(html, @"line-height:[^;]+;", "", RegexOptions.IgnoreCase);
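One caveat with the pattern above: it requires a trailing semicolon, so it will miss line-height when it is the last declaration in the style attribute, and it would also match inside a longer property name that ends in line-height. A slightly more defensive variant might look like the sketch below. StyleCleaner and RemoveLineHeight are hypothetical names used only for illustration, and the pattern assumes the value contains no quotes or semicolons. It still scans the whole string, so text outside a style attribute that looks like "line-height: ..." would be removed as well.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class StyleCleaner
{
    // (?<![\w-])   : do not match inside a longer name such as "-x-line-height"
    // \s*:         : tolerate whitespace before the colon
    // [^;'""]*     : the value, stopping at a semicolon or at the closing quote
    // ;?\s*        : the semicolon is optional (last declaration), then any padding
    private static readonly Regex LineHeight = new Regex(
        @"(?<![\w-])line-height\s*:[^;'""]*;?\s*",
        RegexOptions.IgnoreCase);

    public static string RemoveLineHeight(string html)
    {
        return LineHeight.Replace(html, "");
    }
}
```

Calling StyleCleaner.RemoveLineHeight(html) on the sample span from the question should give the desired output, and a style such as 'color:red;line-height:1.5' should come back as 'color:red;'.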
Upvotes: 0
Reputation: 47784
You could try HtmlAgilityPack for this instead of Regex.
The example below is a little rough, but it should give you the idea.
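// Requires the HtmlAgilityPack NuGet package, plus: using System; using System.Linq;
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml("<span lang=EN-GB style='font-size:10.0pt;line-height:115%;font-family:\"Tahoma\",\"sans-serif\";color:#17365D'>Thank you</span>");
foreach (var item in doc.DocumentNode.Descendants("span"))
{
    var styleAttr = item.Attributes["style"];
    if (styleAttr == null) continue; // skip spans that have no style attribute
    // Split into declarations and drop line-height, whatever its value
    var kept = styleAttr.Value.Split(';')
        .Where(s => !s.Split(':')[0].Trim().Equals("line-height", StringComparison.OrdinalIgnoreCase));
    // Write the remaining declarations back to the attribute
    styleAttr.Value = string.Join(";", kept);
}
// The modified markup
string result = doc.DocumentNode.OuterHtml;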
Upvotes: 1