Reputation: 4175
Does someone have a regex for validating URLs (NOT for finding them inside a text passage)? A JavaScript snippet would be preferred.
Upvotes: 39
Views: 159070
Reputation: 41
The regex you provided is almost correct for matching URLs with optional valid protocols. However, it can be refined for better accuracy and readability. Here's an improved version:
^(https?:\/\/|ftp:\/\/)?(www\.)?([a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,6})(\/[a-zA-Z0-9#\/?=&%.\-]*)?$
^ : Start of the string.
(https?:\/\/|ftp:\/\/)? : Matches http://, https://, or ftp:// and makes it optional.
(www\.)? : Matches www. and makes it optional.
([a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,6}) : Matches the domain name.
    [a-z0-9]+ : Matches the initial part of the domain.
    ([\-\.]{1}[a-z0-9]+)* : Matches subsequent parts of the domain that may include - or . followed by alphanumeric characters.
    \.[a-z]{2,6} : Matches the top-level domain (e.g., .com, .in).
(\/[a-zA-Z0-9#\/?=&%.\-]*)? : Matches the path and query string, including allowed characters, and makes it optional.
$ : End of the string.
const regex = /^(https?:\/\/|ftp:\/\/)?(www\.)?([a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,6})(\/[a-zA-Z0-9#\/?=&%.\-]*)?$/;
const testUrls = [
"yourwebsite.com",
"yourwebsite.com/4564564/546564/546564?=adsfasd",
"www.yourwebsite.com",
"http://yourwebsite.com",
"https://yourwebsite.com",
"ftp://www.yourwebsite.com",
"ftp://yourwebsite.com",
"http://yourwebsite.com/4564564/546564/546564?=adsfasd",
"google.in",
"fb.co",
"live.com",
"test.live",
"http://test.google",
"lop://live.in"
];
testUrls.forEach(url => {
console.log(url.match(regex) ? `Matches: ${url}` : `Does not match: ${url}`);
});
This regex should produce the expected results:
Matches: yourwebsite.com
Matches: yourwebsite.com/4564564/546564/546564?=adsfasd
Matches: www.yourwebsite.com
Matches: http://yourwebsite.com
Matches: https://yourwebsite.com
Matches: ftp://www.yourwebsite.com
Matches: ftp://yourwebsite.com
Matches: http://yourwebsite.com/4564564/546564/546564?=adsfasd
Matches: google.in
Matches: fb.co
Matches: live.com
Matches: test.live
Matches: http://test.google
Does not match: lop://live.in
This approach ensures that URLs with valid protocols are matched and invalid protocols are not.
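One limitation worth noting: the pattern has no i flag, so uppercase letters in the scheme or host are rejected. A minimal sketch, assuming case-insensitive matching is acceptable for your use case (the names strict and relaxed are illustrative, not part of the answer):

```javascript
// Same pattern as above; "strict" and "relaxed" are hypothetical names.
const strict = /^(https?:\/\/|ftp:\/\/)?(www\.)?([a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,6})(\/[a-zA-Z0-9#\/?=&%.\-]*)?$/;

// Rebuild the pattern from its source with the case-insensitive flag added.
const relaxed = new RegExp(strict.source, 'i');

console.log(strict.test('HTTP://Example.COM'));  // false
console.log(relaxed.test('HTTP://Example.COM')); // true
console.log(relaxed.test('lop://live.in'));      // false
```

The {2,6} limit on the top-level domain is a separate restriction that the i flag does not change.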
Upvotes: 0
Reputation: 135
Using only JavaScript, a good approach in some cases is to use the built-in URL constructor:
let urlToValidate = `${decodeURIComponent(url)}`
const isValidUrl = (url = '') => {
try {
new URL(url);
return true;
} catch (error) {
return false;
}
};
let result = isValidUrl(urlToValidate)
console.log(result)
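Keep in mind that the URL constructor accepts any syntactically valid scheme, including javascript: and mailto:. If only web links should pass, a scheme allow-list can be layered on top. A minimal sketch, where the helper name isAllowedUrl and the list of protocols are assumptions, not part of the answer above:

```javascript
// Hypothetical allow-list; adjust to the schemes you actually want to accept.
const ALLOWED_PROTOCOLS = ['http:', 'https:'];

function isAllowedUrl(input) {
  let parsed;
  try {
    parsed = new URL(input); // throws TypeError if input is not an absolute URL
  } catch (error) {
    return false;
  }
  // URL.protocol includes the trailing colon, e.g. "https:".
  return ALLOWED_PROTOCOLS.includes(parsed.protocol);
}

console.log(isAllowedUrl('https://example.com/path?x=1')); // true
console.log(isAllowedUrl('javascript:alert(1)'));          // false
console.log(isAllowedUrl('example.com'));                  // false
```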
Upvotes: 0
Reputation: 6514
I have tried a few but there were a few issues so I came up with this one.
/(https?:\/\/(?:www\d*\.|(?!www\d*\.))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\d*\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\d*\.|(?!www\d*\.))[a-zA-Z0-9]+\.[^\s]{2,}|www\d*\.[a-zA-Z0-9]+\.[^\s]{2,})/gi;
How to use
const isValidUrl = (url = '') => {
if (url) {
var expression =
/(https?:\/\/(?:www\d*\.|(?!www\d*\.))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\d*\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\d*\.|(?!www\d*\.))[a-zA-Z0-9]+\.[^\s]{2,}|www\d*\.[a-zA-Z0-9]+\.[^\s]{2,})/gi;
var regex = new RegExp(expression);
return !!url.match(regex);
}
return false;
};
Breakdown
/(
https?:\/\/ # matches http:// or https://
(?:www\d*\.|(?!www\d*\.) # matches an optional "www" prefix with zero or more digits, followed by a dot,
# or excludes "www" prefix followed by digits
)[a-zA-Z0-9][a-zA-Z0-9-]+ # matches the domain name
[a-zA-Z0-9]\. # matches the dot before the top-level domain
[^\s]{2,} # matches the rest of the URL after the domain name
| # or
www\d*\.[a-zA-Z0-9][a-zA-Z0-9-]+ # matches the "www" prefix with zero or more digits, followed by a dot, and the domain name
[a-zA-Z0-9]\. # matches the dot before the top-level domain
[^\s]{2,} # matches the rest of the URL after the domain name
| # or
https?:\/\/ # matches http:// or https://
(?:www\d*\.|(?!www\d*\.) # matches an optional "www" prefix with zero or more digits, followed by a dot,
# or excludes "www" prefix followed by digits
)[a-zA-Z0-9]+\.[^\s]{2,} # matches the domain name and top-level domain
| # or
www\d*\.[a-zA-Z0-9]+\.[^\s]{2,} # matches the "www" prefix with zero or more digits, followed by a dot, and the domain name and top-level domain
)/gi;
Valid URLs
http://www.example.com
https://www.example.co.uk
http://www1.example.com
http://www2.example.com
http://www3.example.com
https://www1.example.co.uk
https://www2.example.co.uk
https://www3.example.co.uk
https://example.com
http://example.com
www.example.com
www1.example.com
www2.example.com
www3.example.com
www.example.co.uk
www1.example.co.uk
www2.example.co.uk
www3.example.co.uk
Invalid URLs
example
example.com
ftp://example.com
ftp://www.example.com
http://www.example
http://www.example.
http://www.example/
http://example./com
Note: because the pattern is not anchored with ^ and $, isValidUrl() as written still returns true for ftp://www.example.com (its www.example.com part matches), and http://example./com also matches because [^\s]{2,} accepts /com after the dot.
Upvotes: 1
Reputation: 647
From https://www.freecodecamp.org/news/how-to-validate-urls-in-javascript/
function isValidHttpUrl(str) {
const pattern = new RegExp(
'^(https?:\\/\\/)?' + // protocol
'((([a-z\\d]([a-z\\d-]*[a-z\\d])*)\\.)+[a-z]{2,}|' + // domain name
'((\\d{1,3}\\.){3}\\d{1,3}))' + // OR ip (v4) address
'(\\:\\d+)?(\\/[-a-z\\d%_.~+]*)*' + // port and path
'(\\?[;&a-z\\d%_.~+=-]*)?' + // query string
'(\\#[-a-z\\d_]*)?$', // fragment locator
'i'
);
return pattern.test(str);
}
console.log(isValidHttpUrl('https://www.freecodecamp.org/')); // true
console.log(isValidHttpUrl('mailto://freecodecamp.org')); // false
console.log(isValidHttpUrl('freeCodeCamp')); // false
Upvotes: 0
Reputation: 2470
/^(http|ftp)s?:\/\/((?=.{3,253}$)(localhost|(([^ ]){1,63}\.[^ ]+)))$/
explanation:
http / ftp must come first.
s can follow, but not necessarily.
:// are a must right after.
What follows must have a minimum length of 3 (i.e. http://a.b) and a max of 253.
It is either localhost or domain-name.TLD.
domain-name can be made out of multiple labels, divided by a dot (i.e. https://inner.sub.domain.net), and the maximum length of each label is 63.
I didn't see anywhere that there's a limitation on the TLD length, so I didn't put any restriction there.
What @bobince answered is a real concern.
The latest answers are very close (thanks @Akseli), but they all miss the obligatory dot in the URL and the lengths.
The answer I provide above deals with those too.
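A short sketch exercising the pattern above on a few boundary cases. The variable name urlPattern and the sample inputs are illustrative assumptions:

```javascript
// The pattern from this answer, stored under a hypothetical name.
const urlPattern = /^(http|ftp)s?:\/\/((?=.{3,253}$)(localhost|(([^ ]){1,63}\.[^ ]+)))$/;

console.log(urlPattern.test('http://a.b'));        // true  (minimum length of 3)
console.log(urlPattern.test('http://a.'));         // false (only 2 characters after ://)
console.log(urlPattern.test('http://localhost'));  // true
console.log(urlPattern.test('http://nodot'));      // false (no obligatory dot)

// First label of 63 characters is accepted, 64 is rejected.
console.log(urlPattern.test('http://' + 'a'.repeat(63) + '.com')); // true
console.log(urlPattern.test('http://' + 'a'.repeat(64) + '.com')); // false
```

As written, the 63-character limit is applied to the text before the first dot; later labels are only bounded by the overall 253 limit.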
for further reading:
Upvotes: 0
Reputation: 126
This regex is a patched version of the one in @Aamir's answer; it worked for me:
/((?:(?:http?|ftp)[s]*:\/\/)?[a-z0-9-%\/\&=?\.]+\.[a-z]{2,4}\/?([^\s<>\#%"\,\{\}\\|\\\^\[\]`]+)?)/gi
It matches these URL formats
Upvotes: 6
Reputation: 536695
The actual URL syntax is pretty complicated and not easy to represent in regex. Most of the simple-looking regexes out there will give many false negatives as well as false positives. See for amusement these efforts but even the end result is not good.
Plus these days you would generally want to allow IRI as well as old-school URI, so we can link to valid addresses like:
http://en.wikipedia.org/wiki/Þ
http://例え.テスト/
I would go only for simple checks: does it start with a known-good method: name? Is it free of spaces and double-quotes? If so then hell, it's probably good enough.
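The simple checks described above could be sketched roughly like this. The function name looksLikeUrl and the default scheme list are assumptions for illustration, and the sketch rejects any whitespace rather than only spaces:

```javascript
// Hypothetical helper: known-good scheme, no whitespace, no double quotes.
function looksLikeUrl(input, schemes = ['http', 'https', 'ftp']) {
  const colon = input.indexOf(':');
  if (colon <= 0) return false; // no scheme at all
  const scheme = input.slice(0, colon).toLowerCase();
  if (!schemes.includes(scheme)) return false;
  // Reject whitespace and double quotes anywhere in the string.
  return !/[\s"]/.test(input);
}

console.log(looksLikeUrl('http://en.wikipedia.org/wiki/Þ')); // true
console.log(looksLikeUrl('http://例え.テスト/'));              // true
console.log(looksLikeUrl('http://www.goo le.com'));          // false
console.log(looksLikeUrl('example.com'));                    // false
```

Because nothing beyond the scheme is restricted to ASCII, IRIs like the two examples above pass unchanged.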
Upvotes: 38
Reputation: 28161
In the accepted answer bobince got it right: validating only the scheme name, ://, and spaces and double quotes is usually enough. Here is how the validation can be implemented in JavaScript:
var url = 'http://www.google.com';
var valid = /^(ftp|http|https):\/\/[^ "]+$/.test(url);
// true
or
var r = /^(ftp|http|https):\/\/[^ "]+$/;
r.test('http://www.goo le.com');
// false
or
var url = 'http:www.google.com';
var r = new RegExp(/^(ftp|http|https):\/\/[^ "]+$/);
r.test(url);
// false
References for syntax:
Upvotes: 92
Reputation: 31
I couldn't find one that worked well for my needs, so I wrote my own and posted it at https://gist.github.com/geoffreyrobichaux/0a7774b424703b6c0fffad309ab0ad0a
function validURL(s) {
var regexp = /^(ftp|http|https|chrome|:\/\/|\.|@){2,}(localhost|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|\S*:\w*@)*([a-zA-Z]|(\d{1,3}|\.){7}){1,}(\w|\.{2,}|\.[a-zA-Z]{2,3}|\/|\?|&|:\d|@|=|\/|\(.*\)|#|-|%)*$/gum
return regexp.test(s);
}
Upvotes: 2
Reputation: 685
You can simply use type="url" in your input and then check it with checkValidity() in JS.
E.g:
your.html
<input id="foo" type="url">
your.js
$("#foo").on("keyup", function() {
if (this.checkValidity()) {
// The url is valid
} else {
// The url is invalid
}
});
Upvotes: 4
Reputation: 707
I use the /^[a-z]+:[^:]+$/i regular expression for URL validation. See an example of my cross-browser InputKeyFilter code with URL validation.
<!doctype html>
<html xmlns="http://www.w3.org/1999/xhtml" >
<head>
<title>Input Key Filter Test</title>
<meta name="author" content="Andrej Hristoliubov [email protected]">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<!-- For compatibility of IE browser with audio element in the beep() function.
https://www.modern.ie/en-us/performance/how-to-use-x-ua-compatible -->
<meta http-equiv="X-UA-Compatible" content="IE=9"/>
<link rel="stylesheet" href="https://rawgit.com/anhr/InputKeyFilter/master/InputKeyFilter.css" type="text/css">
<script type="text/javascript" src="https://rawgit.com/anhr/InputKeyFilter/master/Common.js"></script>
<script type="text/javascript" src="https://rawgit.com/anhr/InputKeyFilter/master/InputKeyFilter.js"></script>
</head>
<body>
URL:
<input type="url" id="Url" value=":"/>
<script>
CreateUrlFilter("Url", function(event){//onChange event
inputKeyFilter.RemoveMyTooltip();
var elementNewInteger = document.getElementById("NewUrl");
elementNewInteger.innerHTML = this.value;
}
//onblur event. Use this function if you want set focus to the input element again if input value is NaN. (empty or invalid)
, function(event){ this.ikf.customFilter(this); }
);
</script>
New URL: <span id="NewUrl"></span>
</body>
</html>
Also see my page example of the input key filter.
Upvotes: 1
Reputation: 21
Try this; it works for me:
/^(http[s]?:\/\/){0,1}(w{3,3}\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/;
Upvotes: 2
Reputation: 172
I've found some success with this:
/^((ftp|http|https):\/\/)?www\.([A-Za-z]+)\.([A-Za-z]{2,})/
It's obviously not perfect but it handled my cases pretty well
Upvotes: 6
Reputation: 2283
After a lot of research I built this regular expression. I hope it helps others too.
var url = 'https://google.co.in';
var re = /[a-z0-9-\.]+\.[a-z]{2,4}\/?([^\s<>\#%"\,\{\}\\|\\\^\[\]`]+)?$/;
if (!re.test(url)) {
    // No match: report the error
    alert("url error");
} else {
    alert('success');
}
Upvotes: 3
Reputation: 187
/(?:http[s]?:\/\/)?(?:[\w\-]+(?::[\w\-]+)?@)?(?:[\w\-]+\.)+(?:[a-z]{2,4})(?::[0-9]+)?(?:\/[\w\-\.%]+)*(?:\?(?:[\w\-\.%]+=[\w\-\.%!]+&?)+)?(#\w+\-\.%!)?/
Upvotes: 1
Reputation: 838
Try this (named urlRegex so it does not shadow the built-in RegExp constructor):
var urlRegex =/^(?:(?:https?|ftp):\/\/)(?:\S+(?::\S*)?@)?(?:(?!10(?:\.\d{1,3}){3})(?!127(?:\.\d{1,3}){3})(?!169\.254(?:\.\d{1,3}){2})(?!192\.168(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})))(?::\d{2,5})?(?:\/[^\s]*)?$/i;
Upvotes: 1
Reputation: 46985
Try this regex, it works for me:
function isUrl(s) {
var regexp = /(ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?/
return regexp.test(s);
}
Upvotes: 1
Reputation: 39
<html>
<head>
<title>URL</title>
<script type="text/javascript">
function validate() {
var url = document.getElementById("url").value;
var pattern = /(ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?/;
if (pattern.test(url)) {
alert("Url is valid");
return true;
}
alert("Url is not valid!");
return false;
}
</script>
</head>
<body>
URL :
<input type="text" name="url" id="url" />
<input type="submit" value="Check" onclick="validate();" />
</body>
</html>
Upvotes: 3
Reputation: 1937
Try this regex
/(ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?/
It works best for me.
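Since this pattern has no ^ or $ anchors, test() also succeeds when a URL merely appears somewhere inside a longer string. For validating a whole input, one option is to wrap it in anchors. A minimal sketch, where the names loose and anchored are illustrative assumptions:

```javascript
// The pattern from this answer, unchanged.
const loose = /(ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?/;

// Wrap the same source in ^(?: ... )$ so the whole string must match.
const anchored = new RegExp('^(?:' + loose.source + ')$');

console.log(loose.test('see http://example.com now'));    // true
console.log(anchored.test('see http://example.com now')); // false
console.log(anchored.test('http://example.com'));         // true
console.log(anchored.test('http://exa mple.com'));        // false
```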
Upvotes: 29