Reputation: 41
I'm looking for a regex that matches valid, online URLs only.
For example:
example.com
http://example.com
https://example.com
www.example.com
http://www.example.com
https://www.example.com
And special domains and extensions like:
t.co
example.deals
sh.party
And so on, but it shouldn't match the complicated stuff like ftp, GET queries, or URLs like 2.3.3.1.
I've been using '#(www\.|https?://)?[a-z0-9]+\.[a-z0-9]{2,4}\S*#i', but it detects dates, for example 3.3.2017.
I need this because I apply get_headers to every URL found, and when I call get_headers on an invalid URL such as a date, I get:
get_headers(http://03.03.2017): failed to open stream: Connection timed out
TL;DR: I'm looking for a regex that matches only URLs you can apply get_headers() on.
Thanks for helping!
Upvotes: 4
Views: 259
Reputation: 600
#(https?:\/\/)?([a-z0-9_~-]+\.)+[a-z]{2,5}(\/\S*)?#i
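EDIT: Third try: Optional http:// or https:// at the start. After that come one or more domain labels, each followed by a dot, then a top-level domain of 2-5 letters, and an optional tail consisting of a forward slash and any further non-space characters. Because the top-level domain must be letters only, dates like 3.3.2017 no longer match.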
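As a rough illustration of how this pattern behaves, here is a minimal sketch that runs it over a made-up sentence with preg_match_all. The sample text and the variable names ($text, $found) are assumptions for demonstration only, not part of the original answer.

```php
<?php
// The pattern from the answer, unchanged.
$pattern = '#(https?:\/\/)?([a-z0-9_~-]+\.)+[a-z]{2,5}(\/\S*)?#i';

// Hypothetical sample text, invented for this example.
$text = 'Visit example.com, https://www.example.com/page or t.co on 3.3.2017.';

// $matches[0] holds every full match found in the text.
preg_match_all($pattern, $text, $matches);
$found = $matches[0];

print_r($found);
```

The date is skipped because its last part is numeric. Note that the pattern is unanchored and limits the top-level domain to 5 letters, so a longer one is cut short (example.photography would be matched as example.photo), and anything shaped like name.ext will match whether or not the host is actually online.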
Upvotes: 1
Reputation: 197
I would say a regex is not the best solution for checking whether a URL is valid. It would be better to use filter_var() with FILTER_VALIDATE_URL:
<?php
$url = "https://www.w3schools.com";
// filter_var() returns the URL string on success and false on failure.
if (filter_var($url, FILTER_VALIDATE_URL) !== false) {
    echo "$url is a valid URL";
} else {
    echo "$url is not a valid URL";
}
?>
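One thing to keep in mind is that FILTER_VALIDATE_URL requires a scheme, so bare inputs like example.com or t.co from the question fail as they are. The following is only a sketch of one way to handle that: normalize_candidate_url() is a hypothetical helper (not a PHP built-in) that prepends http:// when the scheme is missing, validates the result, and then additionally requires an alphabetic last host label so that dates are rejected.

```php
<?php
// Hypothetical helper, not part of the original answer: returns a
// normalized URL string, or null if the candidate should be skipped.
function normalize_candidate_url(string $candidate): ?string
{
    // FILTER_VALIDATE_URL needs a scheme, so add one to bare hosts.
    if (!preg_match('#^https?://#i', $candidate)) {
        $candidate = 'http://' . $candidate;
    }

    if (filter_var($candidate, FILTER_VALIDATE_URL) === false) {
        return null;
    }

    // Extra check (an assumption of this sketch): the host must contain a
    // dot and end in a letters-only label, which rules out 03.03.2017.
    $host = parse_url($candidate, PHP_URL_HOST);
    if (!is_string($host)) {
        return null;
    }
    $labels = explode('.', $host);
    if (count($labels) < 2 || !preg_match('/^[a-z]{2,}$/i', end($labels))) {
        return null;
    }

    return $candidate;
}
```

A non-null result only means the string is well-formed. get_headers() can still fail or time out if the host does not exist or is offline, so its return value should be checked as well.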
Upvotes: 2