Reputation: 5761
I have the following REGEX:
/(?:(?:src|href|poster|altimg|data)\s*=\s*)(?!['"]?(?:http|#|mailto:|data:|tel:|sms:))['"]([^'">]+)|(?:url\()(?!['"]?(?:http))['"]?([^'")]+)/gm
If the string contains the substring data-href, the regex matches it.
For example:
<figure class="figure--fullwidth figure--linked" data-figure-id="w-3-1" data-index="20"><a class="figure__link" data-href="..\images\W-3-1.png" tabindex="0" data-size="1218x920" data-index="21"><img class="figure__image figure__thumbnail" alt="" src="../images/W-3-1.png" data-image-id="w-3-1" data-index="22" data-size="%7B%22width%22%3A1218%2C%22height%22%3A920%7D"></a></figure>
Here it matches the href="..\images\W-3-1.png" part of data-href="..\images\W-3-1.png".
I don't want that match; the regex should match only when the attribute name is exactly href.
How can I modify the regex so that it matches href but not data-href?
Thanks in advance.
Upvotes: 1
Views: 46
Reputation: 786291
You can add a negative lookbehind assertion at the start of your regex so that an attribute name is matched only when it is not the tail of a longer name such as data-href:
/(?<!\S)(?:(?:src|href|poster|altimg|data)\s*=\s*)(?!['"]?(?:http|#|mailto:|data:|tel:|sms:))['"]([^'">]+)|(?:url\()(?!['"]?(?:http))['"]?([^'")]+)/
Here is a regex demo
(?<!\S) is a negative lookbehind assertion that fails if a non-whitespace character comes immediately before the attribute name, so href in data-href (preceded by -) is skipped.
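To see the difference, here is a minimal sketch that runs both patterns over a shortened version of the markup from the question. The helper name extractPaths and the trimmed sample string are assumptions for illustration only, and the g flag is added to the answer's pattern so that matchAll can be used. Lookbehind assertions need an ES2018-capable engine.

```javascript
// Pattern from the question (without the lookbehind).
const original = /(?:(?:src|href|poster|altimg|data)\s*=\s*)(?!['"]?(?:http|#|mailto:|data:|tel:|sms:))['"]([^'">]+)|(?:url\()(?!['"]?(?:http))['"]?([^'")]+)/g;

// Pattern from the answer: (?<!\S) requires whitespace (or start of input)
// immediately before the attribute name.
const fixed = /(?<!\S)(?:(?:src|href|poster|altimg|data)\s*=\s*)(?!['"]?(?:http|#|mailto:|data:|tel:|sms:))['"]([^'">]+)|(?:url\()(?!['"]?(?:http))['"]?([^'")]+)/g;

// Shortened sample based on the markup in the question.
const html =
  '<a class="figure__link" data-href="..\\images\\W-3-1.png" tabindex="0">' +
  '<img alt="" src="../images/W-3-1.png"></a>';

// Hypothetical helper: collect the captured path from whichever
// alternative matched (group 1 for attributes, group 2 for url(...)).
function extractPaths(pattern, text) {
  return [...text.matchAll(pattern)].map((m) => m[1] ?? m[2]);
}

const originalPaths = extractPaths(original, html);
const fixedPaths = extractPaths(fixed, html);

console.log(originalPaths); // two entries: the data-href path and the src path
console.log(fixedPaths);    // one entry: only the src path
```

One consequence of this choice: an attribute that directly follows a non-whitespace character, for example a closing quote with no space after it, is also skipped by the lookbehind version.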
Upvotes: 1