Reputation: 1531
I would like to have this regex:
.match(/wtflungcancer.com\/\S*(?<!js)/i)
NOT match the following string based on the fact that 'js' is present. However, the following matches the entire URL:
"http://www.wtflungcancer.com/wp-content/plugins/contact-form-7/includes/js/jquery.form.min.js?ver=3.32.0-2013.04.03".match(/wtflungcancer.com\/\S*(?<!js)/i)
Upvotes: 1
Views: 79
Reputation: 89557
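You can try this pattern:
wtflungcancer.com\/(?>[^\s.]++|\.++(?!js))*+(?!\.)
Explanation:
The goal is to allow any character that is not whitespace, and to allow a dot only when it is not followed by js:
(?> # open an atomic group
[^\s.]++ # one or more characters other than whitespace and .
| # OR
\.++(?!js) # one or more dots not followed by js
)*+ # close the atomic group, repeat zero or more times (possessive)
To make sure the pattern covers the whole URL, I add a negative lookahead that checks that the match is not followed by a dot. The repetition has to be possessive (*+) for this check to be reliable: with a plain *, the engine could give back iterations, stop just after an earlier dot (for example after "jquery.form.") and pass the lookahead there.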
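As a rough check of this approach, here is a minimal sketch assuming Ruby, where the question's `.match(/…/i)` call, atomic groups and possessive quantifiers are all valid. The constant names and the second URL are made up for illustration. The sketch writes the repetition as possessive (`*+`), so the engine cannot give iterations back.

```ruby
# Possessive repetition (*+): once the group stops at ".js", the engine may not
# drop earlier iterations, so the final (?!\.) check decides the outcome.
POSSESSIVE_PATTERN = /wtflungcancer.com\/(?>[^\s.]++|\.++(?!js))*+(?!\.)/i

# URL from the question.
JS_URL = "http://www.wtflungcancer.com/wp-content/plugins/contact-form-7/includes/js/jquery.form.min.js?ver=3.32.0-2013.04.03"
# Hypothetical URL without a ".js" segment.
HTML_URL = "http://www.wtflungcancer.com/about/index.html"

puts JS_URL.match(POSSESSIVE_PATTERN).inspect   # nil
puts HTML_URL.match(POSSESSIVE_PATTERN)[0]      # wtflungcancer.com/about/index.html
```

Note that the `(?!js)` check rejects any dot followed by "js", so a path such as "data.json" would be rejected as well.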
Upvotes: 1
Reputation: 3999
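This happens because \S* is greedy and consumes every non-whitespace character up to the end of the URL, so the lookbehind is only checked there. At that point the preceding characters are "03", not "js", so the lookbehind succeeds and the whole URL matches.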
Something like this should work:
/wtflungcancer.com(?!\S*\.js)/i
-- EDIT: more explanation added --
What is the difference between
"wtflungcancer.com\S*(?<!\.js)"
and
"wtflungcancer.com(?!\S*\.js)"
They look really similar!
Lookarounds (lookahead and lookbehind) are assertions: they tell the regexp engine whether a match is acceptable at the current position, and they do not consume characters of the string.
A lookbehind tells the regexp engine to look backwards from the current position. In your case nothing followed the lookbehind to anchor it on the right side, so the "\S*" just consumed all the non-whitespace characters in the string and the lookbehind was tested wherever that left the engine.
For example, this regexp works for finding URLs NOT ending with ".js":
wtflungcancer.com\S+(?<!\.js)$
See? The right side of the lookbehind is anchored using the end of string metacharacter.
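A small sketch of the anchored lookbehind, assuming Ruby. The constant names and the two short URLs are hypothetical.

```ruby
# The `$` forces \S+ to run to the end of the string; only there does the
# engine check that the last three characters are not ".js".
NOT_JS_AT_END = /wtflungcancer.com\S+(?<!\.js)$/i

ENDS_WITH_JS  = "http://www.wtflungcancer.com/js/jquery.form.min.js"   # hypothetical
ENDS_WITH_CSS = "http://www.wtflungcancer.com/css/style.css"           # hypothetical

puts ENDS_WITH_JS.match(NOT_JS_AT_END).inspect   # nil
puts ENDS_WITH_CSS.match(NOT_JS_AT_END)[0]       # wtflungcancer.com/css/style.css
```

The URL from the question ends with a query string ("...04.03"), so this end-anchored version would still match it.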
In our case, though, we couldn't hook anything to the right side, so I switched from lookbehind to lookahead.
So, the real regular expression just matches "wtflungcancer.com": at that point, the lookahead tells the regexp engine: "In order for this match to be correct, this string must not be followed by a sequence of non-whitespace characters followed by '.js'". This works because a lookahead does not consume characters: it scans ahead from the current position to decide whether the match is acceptable, and the match itself still ends right after "wtflungcancer.com".
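To compare the two patterns side by side, here is a minimal sketch assuming Ruby. The constant names and the short "/about" URL are made up for illustration.

```ruby
# URL from the question.
QUESTION_URL = "http://www.wtflungcancer.com/wp-content/plugins/contact-form-7/includes/js/jquery.form.min.js?ver=3.32.0-2013.04.03"

LOOKBEHIND = /wtflungcancer.com\/\S*(?<!js)/i   # pattern from the question
LOOKAHEAD  = /wtflungcancer.com(?!\S*\.js)/i    # pattern from this answer

# The lookbehind version runs to the end of the URL, where "03" precedes the
# check, so it matches everything from the domain onwards.
puts QUESTION_URL.match(LOOKBEHIND)[0]

# The lookahead version scans ahead from the domain, finds ".js", and rejects.
puts QUESTION_URL.match(LOOKAHEAD).inspect      # nil

# With no ".js" ahead, only the domain itself is consumed.
puts "http://www.wtflungcancer.com/about".match(LOOKAHEAD)[0]   # wtflungcancer.com
```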
Upvotes: 2