Reputation: 4870
This regex is meant to reject URLs that start with any URL scheme or with slashes (forward or backward), while allowing URLs such as "domain.tld" that start with neither. It should also allow strings that are not URLs at all (for example, "some random input").
^(?!://)((?!//))(?!(.*?)*://)(?!:\\\\)(?!:/\\\\/\\\\)(?!(.*?)*:/\\\\/\\\\)(?!/\\\\/\\\\)(?!\\\\)(?!(.*?)*:\\\\)(?!www.)(?!(.*?)*.www.).*$
This regex works fine in Java, but in JavaScript it fails on longer strings.
Example: It works fine for "hey. hey hey hey hey"
but starts taking time with "hey. hey hey hey hey "
and hangs after "hey. hey hey hey hey hey hey"
The following cases should be tested against the regex:
String | Expected result
__________________________________________
http://www.google.com | False
HTTP://WWW.google.com | False
adasd://www.google.com | False
ftp://www.google.com | False
mailto://www.google.com | False
//www.google.com | False
://www.google.com | False
www.google.com | False
WWW.google.com | False
test .http://google.com | False
skksdwww.google.com | False
wWW.google.com | False
://google.com | False
.www.google.com | False
as;;; .wwW.google.com | False
as.wwW.google.com | False
= #$@%@#.www.google.com | False
http:/\\/\\google.com | False
:/\\/\\google.com | False
http://gogle.com | False
gogle.com //google.com | False
google.com | true
some random input | true
What could be the problem with it?
UPDATE: I have updated the regex as per Wiktor Stribiżew's comment and it works fine.
Upvotes: 4
Views: 139
Reputation: 626709
The (.*?)* subpattern is disastrous inside larger patterns. The nested * quantifiers (lazy inside, greedy outside) let the regex engine try a huge number of ways to split the input into substrings before it finally fails on a string that should not be matched.
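As a rough illustration, the sketch below compares one nested-quantifier lookahead taken from the question with a flat rewrite of it. The names `nested`, `flat`, `samples` and `compare` are made up for this example, and the inputs are deliberately short, because the nested form gets sharply slower as a non-matching input grows.

```javascript
// One lookahead from the question, isolated: nested quantifiers (.*?)*
const nested = /^(?!(.*?)*:\/\/).*$/;
// The same check with a single lazy quantifier
const flat = /^(?!.*?:\/\/).*$/;

// Short inputs only; long non-matching strings make `nested` very slow.
const samples = ["a://b", "hey. hey", "x.y", "://z"];

// Run both patterns on each input and collect the results side by side.
function compare(inputs) {
  return inputs.map((s) => ({
    input: s,
    nested: nested.test(s),
    flat: flat.test(s),
  }));
}

const results = compare(samples);
console.log(results);
```

Both patterns should give the same answer for every input; the difference is only in how much backtracking the engine has to do before it can report a failure.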
Always test your patterns against strings that should not match.
Also, if you need to match a literal dot, escape it.
Here is your fixed and contracted regex:
^
(?!.*?:?//)
(?!:(?:/\\\\/)?\\\\)
(?!(?:.*?:)?(?:/\\\\/)?\\\\)
(?!(?:.*?\.)?www\.)
.*
$
Or a one-liner:
^(?!.*?:?//)(?!:(?:/\\\\/)?\\\\)(?!(?:.*?:)?(?:/\\\\/)?\\\\)(?!(?:.*?\.)?www\.).*$
See the regex demo
Upvotes: 1
Reputation: 2152
I didn't examine the whole thing (wow, that's a lot of slashes!) but you could greatly simplify the regex, I'm guessing. Just going from your post, maybe this would work for you:
/^(?!.*(?:www\.|\/\/|\/\\\\\/\\\\))/i
It should test negative for any protocol, including an empty one, and including file: URLs. Let me know if I've missed anything the regex needs to test for.
UPDATE: Now passes all tests.
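A small harness along these lines can check the pattern against the rows from the question. This is only a sketch: `cases`, `isAllowed` and `failures` are names invented for the example, and the two backslash rows are left out because their exact escaping depends on how the strings are written in source.

```javascript
// Pattern copied from the answer above; the `i` flag makes "WWW." match too.
const pattern = /^(?!.*(?:www\.|\/\/|\/\\\\\/\\\\))/i;

// [input, expected] pairs taken from the question's table (forward-slash rows only).
const cases = [
  ["http://www.google.com", false],
  ["HTTP://WWW.google.com", false],
  ["adasd://www.google.com", false],
  ["ftp://www.google.com", false],
  ["mailto://www.google.com", false],
  ["//www.google.com", false],
  ["://www.google.com", false],
  ["www.google.com", false],
  ["WWW.google.com", false],
  ["test .http://google.com", false],
  ["skksdwww.google.com", false],
  ["wWW.google.com", false],
  ["://google.com", false],
  [".www.google.com", false],
  ["as;;; .wwW.google.com", false],
  ["as.wwW.google.com", false],
  ["= #$@%@#.www.google.com", false],
  ["http://gogle.com", false],
  ["gogle.com //google.com", false],
  ["google.com", true],
  ["some random input", true],
];

// True when the input is allowed, i.e. the negative lookahead finds nothing.
function isAllowed(input) {
  return pattern.test(input);
}

// Collect every row where the actual result differs from the expected one.
const failures = cases.filter(([input, expected]) => isAllowed(input) !== expected);
console.log(failures.length === 0 ? "all cases pass" : failures);
```

Keeping the cases in a plain array makes it easy to add the strings that should not match first, which is where backtracking problems tend to show up.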
Upvotes: 1