Reputation: 391
I have to recognize URLs in some text.
I use the following code (this.value is the text):
if (new RegExp("([a-zA-Z0-9]+://)?([a-zA-Z0-9_]+:[a-zA-Z0-9_]+@)?([a-zA-Z0-9.-]+\\.[A-Za-z]{2,4})(:[0-9]+)?(/.*)?").test(this.value)) {
    alert("url inside");
}
The problem is that it also recognizes email addresses as URLs. How can I prevent that?
Upvotes: 1
Views: 2315
Reputation: 3180
The expression /[a-zA-Z0-9_]/ is the same as /\w/ (with or without the i flag).
The original RegExp matches the "domain.org" substring in a text like "text [email protected] text mailto:[email protected] text". To fix this, add (?:^|[^@\.\w-]) at the beginning of the RegExp: a match must either start at the beginning of a line or be preceded by a character other than '@', '.', '-', or a \w character.
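A minimal sketch of what that leading guard does, using a simplified host-only pattern rather than the full expression. The names bareHost and guardedHost and the sample address user@domain.org are assumptions made for illustration only.

```javascript
// Simplified host-only patterns (illustrative, not the full URL expression).
const bareHost = /[\w.-]+\.[a-z]{2,4}/i;
const guardedHost = /(?:^|[^@.\w-])[\w.-]+\.[a-z]{2,4}/i;

// Hypothetical sample texts.
const withEmail = "text user@domain.org text";
const withHost = "text domain.org text";

console.log(bareHost.test(withEmail));    // true: a host-like substring is found
console.log(guardedHost.test(withEmail)); // false: "domain.org" is preceded by "@"
console.log(guardedHost.test(withHost));  // true: "domain.org" is preceded by a space
```

The guard consumes one character before the match, so a captured match may start with that extra character; for a plain test() call this does not matter.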
To exclude "mailto:user@..." substrings, the expression ([a-zA-Z0-9_]+:[a-zA-Z0-9_]+@)? should be modified. Because JavaScript RegExp had no look-behind expressions when this was written (they arrived later, with ES2018), the only way to exclude "mailto" is to use the look-ahead expression \w(?!ailto:)\w+:, but every substring of the form "[a-zA-Z0-9_]ailto:...@..." will be excluded as well.
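The look-ahead can be tried in isolation, as in the sketch below. It assumes the user-info part is combined with the leading guard (?:^|[^@.\w-]), since without an anchor the match could simply start at the "a" of "mailto". The name userInfo and the sample strings are hypothetical.

```javascript
// User-info part with the look-ahead, anchored by the leading guard so the
// match cannot begin in the middle of the word "mailto".
const userInfo = /(?:^|[^@.\w-])\w(?!ailto:)\w+:\w+@/i;

console.log(userInfo.test("user:secret@host")); // true: ordinary user:password@
console.log(userInfo.test("mailto:user@host")); // false: "m" is followed by "ailto:"
console.log(userInfo.test("bailto:user@host")); // false: the side effect described above
```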
To keep the substring "user.name" in a text like "text [email protected] text" from matching, add the expression (?=$|[^@\.\w-]) at the end of the RegExp: a substring matches only if it is followed by the end of a line or by a character other than '@', '.', '-', or a \w character.
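As a rough illustration of the trailing look-ahead, the sketch below compares a simplified host-only pattern with and without it. The names noTail and withTail and the address user.name@domain.org are assumptions for the example, not part of the original code.

```javascript
// Host-only patterns with the leading guard; the second adds the trailing look-ahead.
const noTail = /(?:^|[^@.\w-])[\w.-]+\.[a-z]{2,4}/i;
const withTail = /(?:^|[^@.\w-])[\w.-]+\.[a-z]{2,4}(?=$|[^@.\w-])/i;

// Hypothetical sample: the local part of the address looks like a host name.
const dottedEmail = "text user.name@domain.org text";

console.log(noTail.test(dottedEmail));              // true: "user.name" is taken as a host
console.log(withTail.test(dottedEmail));            // false: "user.name" is followed by "@"
console.log(withTail.test("text domain.org text")); // true: followed by a space
```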
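// Leading guard, optional scheme, optional user:password@ (but not "mailto:"),
// host, optional port, optional path, trailing guard.
var re = /(?:^|[^@\.\w-])([a-z0-9]+:\/\/)?(\w(?!ailto:)\w+:\w+@)?([\w.-]+\.[a-z]{2,4})(:[0-9]+)?(\/.*)?(?=$|[^@\.\w-])/im;
// Usage in the original handler:
//if (re.test(this.value)) {
//    alert("url inside");
//}
// Must not match: the text contains only e-mail addresses.
var s1 = "text [email protected] [email protected] text mailto:[email protected] text";
if (re.test(s1)) {
    alert("Failed: text without URL");
}
// Must match: the text contains a URL.
var s2 = "text http://domain.org/ text";
if (!re.test(s2)) {
    alert("Failed: text with URL");
}
alert("OK");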
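The same expression can be wrapped in a small helper so it is easier to reuse and to check outside a browser. This is only a sketch: the function name containsUrl and the sample strings (including the address user.name@domain.org) are assumptions, not part of the original code.

```javascript
// The expression from above, unchanged.
const urlPattern = /(?:^|[^@\.\w-])([a-z0-9]+:\/\/)?(\w(?!ailto:)\w+:\w+@)?([\w.-]+\.[a-z]{2,4})(:[0-9]+)?(\/.*)?(?=$|[^@\.\w-])/im;

// Hypothetical helper: returns true if the text appears to contain a URL.
function containsUrl(text) {
    // No "g" flag, so test() keeps no state between calls.
    return urlPattern.test(text);
}

console.log(containsUrl("text http://domain.org/ text"));     // true
console.log(containsUrl("http://user:pw@domain.org:8080/x")); // true
console.log(containsUrl("text user.name@domain.org text"));   // false
console.log(containsUrl("mailto:user.name@domain.org"));      // false
```

Because the top-level domain is limited to {2,4} letters, longer ones will not be recognized; that limit comes from the original expression.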
Upvotes: 2