Reputation: 4482
I currently use this regular expression to validate URLs:
^([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?([a-z0-9-.]*)\.([a-z]{2,4})(\:0*(?:6553[0-5]|655[0-2][0-9]|65[0-4][0-9]{2}|6[0-4][0-9]{3}|[1-5][0-9]{4}|[1-9][0-9]{1,3}|[0-9]))?(\/([a-z0-9+\$_-]\.?)+)*\/?(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?$
It matches a fairly long list of URLs:
google.com
google.com#a1
google.com?abc=123
google.com:80
google.com:80#a1
google.com:80?abc=123
google.com:80/test
google.com:80/test#a1
google.com:80/test?abc=123
google.com:80/test?abc=123#a1
www.google.com
www.google.com#a1
www.google.com?abc=123
www.google.com:80
www.google.com:80#a1
www.google.com:80?abc=123
www.google.com:80/test
www.google.com:80/test#a1
www.google.com:80/test?abc=123
www.google.com:80/test?abc=123#a1
www.www.google.com
www.www.google.com#a1
www.www.google.com?abc=123
www.www.google.com:80
www.www.google.com:80#a1
www.www.google.com:80?abc=123
www.www.google.com:80/test
www.www.google.com:80/test#a1
www.www.google.com:80/test?abc=123
www.www.google.com:80/test?abc=123#a1
john:[email protected]
john:[email protected]#a1
john:[email protected]?abc=123
john:[email protected]:80
john:[email protected]:80#a1
john:[email protected]:80?abc=123
john:[email protected]:80/test
john:[email protected]:80/test#a1
john:[email protected]:80/test?abc=123
john:[email protected]:80/test?abc=123#a1
john:[email protected]
john:[email protected]#a1
john:[email protected]?abc=123
john:[email protected]:80
john:[email protected]:80#a1
john:[email protected]:80?abc=123
john:[email protected]:80/test
john:[email protected]:80/test#a1
john:[email protected]:80/test?abc=123
john:[email protected]:80/test?abc=123#a1
john:[email protected]
john:[email protected]#a1
john:[email protected]?abc=123
john:[email protected]:80
john:[email protected]:80#a1
john:[email protected]:80?abc=123
john:[email protected]:80/test
john:[email protected]:80/test#a1
john:[email protected]:80/test?abc=123
john:[email protected]:80/test?abc=123#a1
However, it does not match these URLs which, to my knowledge, are also valid:
8.8.8.8
8.8.8.8#a1
8.8.8.8?abc=123
8.8.8.8:80
8.8.8.8:80#a1
8.8.8.8:80?abc=123
8.8.8.8:80/test
8.8.8.8:80/test#a1
8.8.8.8:80/test?abc=123
8.8.8.8:80/test?abc=123#a1
john:[email protected]
john:[email protected]#a1
john:[email protected]?abc=123
john:[email protected]:80
john:[email protected]:80#a1
john:[email protected]:80?abc=123
john:[email protected]:80/test
john:[email protected]:80/test#a1
john:[email protected]:80/test?abc=123
john:[email protected]:80/test?abc=123#a1
For reference, I found this one for IP addresses which seems to work well:
^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
How can I tie them together? Or, is there a better regex to match all the URLs here?
Demo:
http://rubular.com/r/ufuNkHqX5G
Upvotes: 0
Views: 108
Reputation: 3832
Validating an email address is more complicated than validating a webpage URL. In fact, there seems to be no single definitive regex for validating an email address; see "Using a regular expression to validate an email address".
If you use PHP, you aren't limited to using a regex to validate email addresses and URLs, as the following code illustrates:
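<?php
// filter_var() returns the filtered value on success and false on failure.
$url = "http://8.8.8.8";
$mess = (filter_var($url, FILTER_VALIDATE_URL) !== false) ? "valid" : "invalid";
echo $mess, ": $url\n";
// The space in this address makes it invalid.
$email = "me@he re.com";
$mess = (filter_var($email, FILTER_VALIDATE_EMAIL) !== false) ? "valid" : "invalid";
echo $mess, ": $email\n";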
Upvotes: 0
Reputation: 3943
You can combine two regular expressions together by doing (?:<regex1>|<regex2>), which means whatever matches regex1 or regex2. (The ?: means the added parentheses won't capture.)
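As a rough illustration of that alternation, here is a minimal Python sketch that joins a domain pattern and an IPv4 pattern into one host matcher. The names DOMAIN, OCTET, IPV4, HOST and is_host are hypothetical, and DOMAIN is a simplified stand-in for the host part of the regex in the question, not the full original. Note that the ^ and $ anchors sit outside the group so they apply to both alternatives.

```python
import re

# Simplified stand-in for the host part of the question's regex (an assumption).
DOMAIN = r"[a-z0-9-]+(?:\.[a-z0-9-]+)*\.[a-z]{2,4}"

# One IPv4 octet (0-255), taken from the IP regex quoted in the question.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
IPV4 = rf"(?:{OCTET}\.){{3}}{OCTET}"

# (?:regex1|regex2): match either alternative, without creating a capture group.
HOST = re.compile(rf"^(?:{IPV4}|{DOMAIN})$", re.IGNORECASE)


def is_host(text: str) -> bool:
    """Return True if text is an IPv4 address or a dotted domain name."""
    return HOST.match(text) is not None


if __name__ == "__main__":
    for candidate in ["google.com", "www.google.com", "8.8.8.8", "999.8.8.8"]:
        print(candidate, is_host(candidate))
```

The same idea should carry over to the full URL regex: replace its host portion with the non-capturing group and leave the user, port, path, query and fragment parts as they are.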
You can find a variety of regexes for URL validation online; for example, "In search of the perfect URL validation regex" lists quite a few.
Upvotes: 1