Terminador

Reputation: 1398

Clean font-size tag with Regex

I am trying to use a regex to remove this declaration from <font style="font-size:85%;font-family:arial,sans-serif">:

font-size:85%;

My Regex is ^font-size:(*);

That is, I need to delete the font-size declaration completely.

Can someone help me, please?

Thank you!

Upvotes: 3

Views: 3915

Answers (2)

Nikola Malešević

Reputation: 1858

This is the regex you will need:

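using System.Text.RegularExpressions;

string html = @"<font style=""font-size:85%;font-family:arial,sans-serif"">";
string pattern = @"font-size\s*?:.*?(;|(?=""|'|;))";
// Replace every match with an empty string; the result is <font style="font-family:arial,sans-serif">
string cleanedHtml = Regex.Replace(html, pattern, string.Empty);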

This regex works even if the font-size is given in pt or em, or if a different set of CSS declarations is present (e.g. font-family is not specified). If font-size is the last declaration and has no trailing semicolon, the match stops just before the closing quote.

The explanation of the regex follows:

// font-size\s*?:.*?(;|(?="|'|;))
// 
// Match the characters “font-size” literally «font-size»
// Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) «\s*?»
//    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
// Match the character “:” literally «:»
// Match any single character that is not a line break character «.*?»
//    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
// Match the regular expression below and capture its match into backreference number 1 «(;|(?="|'|;))»
//    Match either the regular expression below (attempting the next alternative only if this one fails) «;»
//       Match the character “;” literally «;»
//    Or match regular expression number 2 below (the entire group fails if this one fails to match) «(?="|'|;)»
//       Assert that the regex below can be matched, starting at this position (positive lookahead) «(?="|'|;)»
//          Match either the regular expression below (attempting the next alternative only if this one fails) «"»
//             Match the character “"” literally «"»
//          Or match regular expression number 2 below (attempting the next alternative only if this one fails) «'»
//             Match the character “'” literally «'»
//          Or match regular expression number 3 below (the entire group fails if this one fails to match) «;»
//             Match the character “;” literally «;»
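
To check the claim about other units and other sets of declarations, here is a minimal sketch that runs the same pattern over a few style attributes. The class name FontSizeCleaner, the helper RemoveFontSize, and the sample inputs are made up for illustration; only the pattern itself comes from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FontSizeCleaner
{
    // Same pattern as in the answer: "font-size", optional whitespace, a colon,
    // then lazily any characters up to a ';' (consumed) or up to a quote or ';'
    // that is only looked at, not consumed.
    private const string Pattern = @"font-size\s*?:.*?(;|(?=""|'|;))";

    // Hypothetical helper, introduced only for this sketch.
    public static string RemoveFontSize(string html)
    {
        return Regex.Replace(html, Pattern, string.Empty);
    }

    public static void Main()
    {
        string[] samples =
        {
            // Percentage value followed by another declaration.
            @"<font style=""font-size:85%;font-family:arial,sans-serif"">",
            // pt value as the last declaration, no trailing semicolon.
            @"<font style=""font-family:arial;font-size:12pt"">",
            // em value inside a single-quoted attribute.
            @"<font style='font-size: 1.2em'>",
        };

        foreach (string sample in samples)
        {
            Console.WriteLine(RemoveFontSize(sample));
        }
    }
}
```

Note that when font-size is the only declaration, an empty style attribute is left behind; removing that would need a separate step.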

Upvotes: 3

Oded

Reputation: 499212

Several things with your current regex will cause it to fail:

^font-size:(*);

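You are anchoring to the start of the input with ^, but font-size is not at the start; it sits inside the style attribute.

* is a quantifier and needs something before it to repeat; on its own, as in (*), it means nothing.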

Change it to:

font-size: ?\d{1,2}%;
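
As a rough sketch of how this pattern could be applied in C#, assuming the same Regex.Replace approach as in the other answer (the class and method names here are hypothetical, chosen only for the example):

```csharp
using System;
using System.Text.RegularExpressions;

public static class PercentFontSizeCleaner
{
    // Pattern from the answer: "font-size:", an optional space,
    // one or two digits, then "%;".
    private const string Pattern = @"font-size: ?\d{1,2}%;";

    // Hypothetical helper, introduced only for this sketch.
    public static string RemovePercentFontSize(string html)
    {
        return Regex.Replace(html, Pattern, string.Empty);
    }

    public static void Main()
    {
        string html = @"<font style=""font-size:85%;font-family:arial,sans-serif"">";
        Console.WriteLine(RemovePercentFontSize(html));

        // A value in another unit is not matched, so the input comes back unchanged.
        Console.WriteLine(RemovePercentFontSize(@"<font style=""font-size:12pt;"">"));
    }
}
```

This pattern is deliberately narrow: it only removes one- or two-digit percentage values that end with a semicolon, so values such as 100%, 12pt, or a final declaration without a semicolon are left in place.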

Upvotes: 6
