Reputation: 42139
I have seen this question on SO before, but it was specific to a tag or attribute.
I need to match any attribute values with a regex. I have the following, which matches both the attribute and value:
(\S+)=["']?((?:.(?!["']?\s+(?:\S+)=|[>"']))+.)["']?
But I only want it to match the value and the quotes around it. It also needs to account for both single and double quotes.
I understand the suggestions to avoid doing this with HTML and to use a parser instead, but this is a specific situation where I need a regex. I am only using it to color code the attribute value.
Any help?
Upvotes: 1
Views: 4334
Reputation: 9591
I made a slight modification to your regex string: I replaced the (\S+)= with (?<==)
I think your regex implementation should be able to do a positive lookbehind.
This regex behaves inconsistently when one kind of quote is nested inside the other, like this: <a onclick='StackExchange.switchMobile("on")'>mobile</a>
You may want to look into changing your character classes to get around that.
Here's the full regex string:
(?<==)["']?((?:.(?!["']?\s+(?:\S+)=|[>"']))+.)["']?
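To see both the effect of the lookbehind and the nested-quote caveat, here is a minimal sketch in Python. Python is my assumption, since the question does not say which regex engine is used. The whitespace class is written as a single-backslash \s, and the helper name match_values and the sample tags are made up for illustration.

```python
import re

# The modified pattern from this answer. The whitespace class is written
# as \s here (an assumption about what the original intends).
LOOKBEHIND_PATTERN = re.compile(
    r"""(?<==)["']?((?:.(?!["']?\s+(?:\S+)=|[>"']))+.)["']?"""
)


def match_values(html):
    """Return every full match: the value plus any quotes the pattern consumed."""
    return [m.group(0) for m in LOOKBEHIND_PATTERN.finditer(html)]


# Hypothetical sample inputs.
simple = '<a href="index.html" class=\'nav\'>Home</a>'
nested = '<a onclick=\'StackExchange.switchMobile("on")\'>mobile</a>'

# The attribute names are no longer part of the match.
print(match_values(simple))

# The match is cut short at the inner double quote.
print(match_values(nested))
```

On the simple tag the matches are the two quoted values without the attribute names. On the nested tag the match ends at the first inner double quote, which is the inconsistency mentioned above.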
As per our online chat discussion, I came up with a new regex which is shorter and much cleaner:
(?<==)('|").*?\1(?=.*?>)
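What this regex does is as follows:
Looks behind for an = symbol - (?<==)
Matches an opening single or double quote - ('|")
Lazily matches everything up to the same closing quote - .*?\1
Makes sure there is a > somewhere ahead of our match - (?=.*?>)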
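A quick way to try the shorter regex is the sketch below, again assuming Python's re module (the question does not name an engine). The helper name quoted_values and the sample tags are hypothetical.

```python
import re

# The shorter pattern: lookbehind for "=", an opening quote captured in
# group 1, a lazy run up to the same quote (\1), and a lookahead for ">".
VALUE_PATTERN = re.compile(r"""(?<==)('|").*?\1(?=.*?>)""")


def quoted_values(html):
    """Return each attribute value together with its surrounding quotes."""
    # group(0) is the whole match; group(1) would be only the quote character.
    return [m.group(0) for m in VALUE_PATTERN.finditer(html)]


# Hypothetical sample inputs.
simple = '<a href="index.html" class=\'nav\'>Home</a>'
nested = '<a onclick=\'StackExchange.switchMobile("on")\'>mobile</a>'

print(quoted_values(simple))
print(quoted_values(nested))
```

Because the closing quote has to be the same character as the opening one, the double quotes nested inside the single-quoted onclick value no longer end the match early. Note that this only covers quoted values; an unquoted value such as width=100 is not matched.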
Upvotes: 5