jkupczak

Reputation: 3021

Regex to remove HTML attribute from any HTML tag (style="")?

I'm looking for a regex pattern that will look for an attribute within an HTML tag. Specifically, I'd like to find all instances of ...

style=""

... and remove it from the HTML tag that contains it. Obviously this would include anything contained within the double quotes as well.

I'm using Classic ASP to do this. I already have a function set up for a different regex pattern that looks for all HTML tags in a string and removes them. It works great. But now I just need another pattern specifically for removing all of the style attributes.

Any help would be greatly appreciated.

Upvotes: 18

Views: 40734

Answers (9)

foad abdollahi

Reputation: 1978

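Try this (note that it matches class attributes as well as style; the closing quote is a backreference so that it must be the same character as the opening quote):

(style|class)=(["'])(.*?)\2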

Upvotes: 1

PersyJack

Reputation: 1964

In Visual Studio's Find and Replace, this is what I use to remove style and class attributes:

\s*(style|class)="[^"]*"

This removes the leading whitespace along with the style and class attributes. The parentheses limit the alternation to the attribute name, so both names share the ="..." part. [^"]* matches anything except a double quote, including newlines, so a value that spreads over several lines is matched up to its closing double quote.
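
A minimal JavaScript sketch of the same idea, with the two attribute names grouped so that the alternation covers only the name. The helper name stripStyleAndClass and the sample markup are illustrative assumptions, and this version requires at least one whitespace character before the name so that attributes such as data-style are left alone:

```javascript
// Hypothetical helper: removes double-quoted style and class attributes.
function stripStyleAndClass(html) {
  // \s+              whitespace in front of the attribute (removed with it)
  // (?:style|class)  grouped alternation, so both names share the ="..." part
  // [^"]*            anything except a double quote, newlines included
  return html.replace(/\s+(?:style|class)="[^"]*"/gi, "");
}

// Sample markup (an assumption) with a value spread over two lines.
const sample = '<p class="note"\n   style="color: red;\n margin: 0">hi</p>';
console.log(stripStyleAndClass(sample)); // <p>hi</p>
```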

Upvotes: 4

indextwo

Reputation: 5905

The following expression should remove a style attribute together with its value; crucially, it works whether the attribute uses double or single quotes:

/style=("|')(?:(?!\1)[^\\]|\\.)*?\1/gi

The capture group matches the opening quote, single or double, and the backreference \1 requires the same quote character to close the value. A backreference has no special meaning inside a character class, so the negative lookahead (?!\1) is what stops the match at the closing quote. Everything in between is matched, including URL-encoded characters and line breaks, whilst other attributes (like classes or names) are left intact.

Tested here: https://regexr.com/4rovf
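
As a rough sketch of quote-aware removal in JavaScript, the following simplified variant captures the opening quote and lazily matches up to the same quote character. It deliberately skips backslash-escape handling, since HTML attribute values do not use backslash escapes; the helper name stripStyle and the sample strings are assumptions made for illustration:

```javascript
// Hypothetical helper: removes style attributes quoted with " or '.
function stripStyle(html) {
  // (["'])     captures the opening quote
  // [\s\S]*?   lazily matches any characters, line breaks included
  // \1         requires the same quote character to close the value
  return html.replace(/\s+style=(["'])[\s\S]*?\1/gi, "");
}

const single = "<div style='font-family: \"Courier New\"' class=\"a\">x</div>";
const double = '<p style="color:red">y</p>';

console.log(stripStyle(single)); // <div class="a">x</div>
console.log(stripStyle(double)); // <p>y</p>
```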

Upvotes: 0

Sachin Gaur

Reputation: 727

Try this; it will remove the style attribute and its value completely:

const regex = /style="(.*?)"/gm;
const str = `<div class="frame" style="font-family: Monaco, Consolas, &quot;Courier New&quot;, monospace; font-size: 12px; background-color: rgb(245, 245, 245);">some text</div>`;
const subst = ``;

// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);

console.log('Substitution result: ', result);

Upvotes: 5

Nataly S'omka

Reputation: 11

I tried Jason Gennaro's regular expression and modified it slightly, adding & to the character class:

/style="[a-zA-Z0-9:;&\.\s\(\)\-\,]*"/ig

This regular expression also handles values that contain &quot; entities inside the string, for example:

 <div class="frame" style="font-family: Monaco, Consolas, &quot;Courier New&quot;, monospace; font-size: 12px; background-color: rgb(245, 245, 245);">some text</div>

Upvotes: 1

Dmitry Matrosov

Reputation: 420

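This expression works for me (the quantifier is lazy so that the match stops at the first closing quote instead of running on to the last quote on the line):

/style=".+?"/ig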

Upvotes: 0

CpILL

Reputation: 6989

Perhaps a simpler expression is

 style="[^\"]*"

It matches everything between the double quotes except a double quote, so the match always ends at the attribute's own closing quote.
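
To illustrate why excluding the double quote matters, here is a small JavaScript comparison against a greedy .* on a tag with a second attribute. The sample markup and variable names are assumptions chosen for the demonstration:

```javascript
// Sample tag (an assumption) with an attribute after style.
const html = '<p style="color: red" id="intro">text</p>';

// Greedy .* runs to the LAST double quote on the line and takes id with it.
const greedy = html.replace(/ style=".*"/g, "");

// [^"]* cannot cross a double quote, so it stops at style's closing quote.
const bounded = html.replace(/ style="[^"]*"/g, "");

console.log(greedy);  // <p>text</p>
console.log(bounded); // <p id="intro">text</p>
```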

Upvotes: 54

Jason Gennaro

Reputation: 34855

I think this might do it:

/style="[a-zA-Z0-9:;\.\s\(\)\-\,]*"/gi

You could also put these in capturing groups if you wanted to replace only some parts:

/(style=")([a-zA-Z0-9:;\.\s\(\)\-\,]*)(")/gi

Working Example: http://regexr.com?2up30
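
A short JavaScript sketch of how the capturing groups might be used to change only part of the match. The sample markup and the variable names are illustrative assumptions; the pattern is the grouped one shown above:

```javascript
// Group 1: style="   Group 2: the value   Group 3: closing quote
const pattern = /(style=")([a-zA-Z0-9:;\.\s\(\)\-\,]*)(")/gi;
const html = '<span style="color: red; font-size: 12px;">hi</span>';

// Keep the attribute but empty its value by re-emitting groups 1 and 3 only.
const emptied = html.replace(pattern, "$1$3");

// Or rewrite just the value (group 2) with a replacer function.
const upper = html.replace(
  pattern,
  (match, open, value, close) => open + value.toUpperCase() + close
);

console.log(emptied); // <span style="">hi</span>
console.log(upper);   // <span style="COLOR: RED; FONT-SIZE: 12PX;">hi</span>
```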

Upvotes: 25

FailedDev

Reputation: 26930

This works in Perl. You may need to adapt the regex to ASP's rules a little, but it should work for any tag. The part before style uses [^<>]* rather than .* so that a match cannot run across several tags.

$file =~ s/(<\s*[a-z][a-z0-9]*[^<>]*\s)(style\s*=\s*".*?")([^<>]*>)/$1$3/sig;

Here $file holds the contents of an HTML file.

The same thing in .NET (C#):

      string resultString = null;
      string subjectString = "<html style=\"something\"> ";

      // $1 is the tag up to the style attribute, $3 is the rest of the tag.
      resultString = Regex.Replace(subjectString, @"(<\s*[a-z][a-z0-9]*[^<>]*\s)(style\s*=\s*"".*?"")([^<>]*>)", "$1$3", RegexOptions.Singleline | RegexOptions.IgnoreCase);

Result: <html >
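
For comparison, a tag-scoped removal can be sketched in JavaScript in two passes: first find each opening tag, then strip style only inside it, so that text outside tags is never touched. The helper name stripStyleInTags and the sample input are assumptions, and like the patterns above it handles double-quoted values only:

```javascript
// Hypothetical helper: removes style attributes only inside opening tags.
function stripStyleInTags(html) {
  // Outer pass: match each opening tag, e.g. <p ...> (closing tags are skipped).
  return html.replace(/<[a-z][^<>]*>/gi, (tag) =>
    // Inner pass: drop style="..." and the whitespace in front of it.
    tag.replace(/\s+style\s*=\s*"[^"]*"/gi, "")
  );
}

// The second style="b" is plain text, not an attribute, and must survive.
const input = '<p style="a">say style="b"</p>';
console.log(stripStyleInTags(input)); // <p>say style="b"</p>
```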

Upvotes: 0
