Reputation: 1331
In need of a regex master here!
<img src="\img.gif" style="float:left; border:0" />
<img src="\img.gif" style="border:0; float:right" />
Given the above HTML, I need a regex pattern that will match "float:right" or "float:left" but only on an img tag.
Thanks in advance!
Upvotes: 0
Views: 4113
Reputation: 9332
I agree with Sean Nyman: it's best not to use a regex (at least not for anything permanent). For something ad hoc that is still a bit more durable, you might try:
/<img\s(?:\s*\w+\s*=\s*(?:'[^']*'|"[^"]*"))*?\s*\bstyle\s*=\s*(?:"[^"]*?\bfloat\s*:\s*(\w+)|'[^']*?\bfloat\s*:\s*(\w+))/i
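To show how the two capture groups would be read, here is a rough Python sketch. It assumes the alternation after `style=` is closed with a final `)` and that both branches carry `\b` before `float`. The names `IMG_FLOAT` and `float_of` are made up for this example. Group 1 fills for a double-quoted style value and group 2 for a single-quoted one, so only one of them is ever set.

```python
import re

# Hypothetical name; the pattern is split over two raw strings for readability.
IMG_FLOAT = re.compile(
    r"""<img\s(?:\s*\w+\s*=\s*(?:'[^']*'|"[^"]*"))*?\s*\bstyle\s*=\s*"""
    r"""(?:"[^"]*?\bfloat\s*:\s*(\w+)|'[^']*?\bfloat\s*:\s*(\w+))""",
    re.IGNORECASE,
)

def float_of(tag):
    """Return the float value found in an img tag's style, or None."""
    m = IMG_FLOAT.search(tag)
    if m is None:
        return None
    # Exactly one of the two groups participates in a match.
    return m.group(1) or m.group(2)

print(float_of('<img src="\\img.gif" style="float:left; border:0" />'))
print(float_of("<img style='border:0; float:right'>"))
```

Because the attribute names are matched with `\w+`, an attribute such as `data-id` before `style` would stop this pattern from matching, which is one of the corner cases the warnings in this thread are about.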
Upvotes: 0
Reputation: 4470
You really shouldn't use regex to parse HTML or XML; it's impossible to design a foolproof regex that handles all corner cases. Instead, I would suggest finding an HTML-parsing library for your language of choice.
That said, here's a possible solution using regex.
<img\s[^>]*?style\s*=\s*".*?(?<="|;)\s*(float:.*?)(?=;|").*?"
The float declaration (for example float:left) will be captured in the only capturing group there, which should be number 1.
The regex basically matches the start of an img tag, followed by any type of character that isn't a close bracket any number of times, followed by the style attribute. Within the style attribute's value, the float: can be anywhere within the attribute, but it should only match the actual float style (i.e. it's preceded by the start of the attribute or a semicolon and followed by a semicolon or the end of the attribute).
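As a quick check of that description, here is a small Python sketch run against the two tags from the question. It assumes the lookbehind is spelled `(?<=...)` and allows optional whitespace (`\s*`) between the semicolon and `float:`. Without that, the second tag, which has a space after `border:0;`, would not match. `FLOAT_IN_IMG` and `find_img_floats` are names invented for the example.

```python
import re

# Lookbehind: the declaration must follow the opening quote or a semicolon.
# Lookahead: it must end at a semicolon or the closing quote.
FLOAT_IN_IMG = re.compile(
    r'<img\s[^>]*?style\s*=\s*".*?(?<="|;)\s*(float:.*?)(?=;|").*?"'
)

def find_img_floats(html):
    """Return every captured float declaration, e.g. 'float:left'."""
    return [m.group(1) for m in FLOAT_IN_IMG.finditer(html)]

html = (
    '<img src="\\img.gif" style="float:left; border:0" />\n'
    '<img src="\\img.gif" style="border:0; float:right" />\n'
    '<div style="float:left">not an image</div>\n'
)
print(find_img_floats(html))
```

The `div` is skipped because the pattern only starts at `<img` and `[^>]*?` cannot cross the end of a tag.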
Upvotes: 2
Reputation: 124365
/<img\s[^>]*style\s*=\s*"[^"]*\bfloat\s*:\s*(left|right)[^"]*"/i
Have to advise you, though: in my experience, no matter what regex you write, someone will be able to come up with valid HTML that breaks it. If you really want to do this in a general, reliable way, you need to parse the HTML, not throw regexes at it.
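For comparison, here is a minimal sketch of the parse-it-instead route using Python's standard-library `html.parser`. The class and function names (`ImgFloatFinder`, `img_floats`) are invented for this example, and the style value is split naively on `;`, which is an assumption that holds for simple inline styles like the ones in the question.

```python
from html.parser import HTMLParser

class ImgFloatFinder(HTMLParser):
    """Collects the float value from the style attribute of each img tag."""

    def __init__(self):
        super().__init__()
        self.floats = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags such as <img ... />.
        if tag != "img":
            return
        style = dict(attrs).get("style") or ""
        for decl in style.split(";"):
            name, sep, value = decl.partition(":")
            if sep and name.strip().lower() == "float":
                self.floats.append(value.strip().lower())

def img_floats(html):
    finder = ImgFloatFinder()
    finder.feed(html)
    finder.close()
    return finder.floats

print(img_floats(
    '<img src="\\img.gif" style="float:left; border:0" />'
    '<img src="\\img.gif" style="border:0; float:right" />'
))
```

The parser takes care of quoting style, attribute order, and letter case, so the code only has to look at the `style` value itself.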
Upvotes: 4