Jorge

Reputation: 18237

Improve regular expression for URL

I have this regular expression for validating URLs:

"^(((https?|ftp|file|)://)|(www))[-A-Za-z0-9+&@#/%?=~_|!:,.;]*[-A-Za-z0-9+&@#/%=~_|]$"

Almost all of my test cases work, except one:

"www.foo" <--- WRONG: this URL is not valid for my system, but the pattern accepts it
"www.foo.com" <--- valid
"www.blah.net" <--- valid
"http://blah.com" <--- valid
"https://blah.com" <--- valid

Could anybody help me improve my regular expression?

Upvotes: 0

Views: 157

Answers (2)

sourabh kasliwal

Reputation: 977

A PHP function that validates URLs against a fixed list of top-level domains:

<?php

// Returns true if $val is a host name (optionally with a scheme, a port, and
// a trailing slash) that ends in one of the listed top-level domains.
function validateURL($val) {
  // Dots are escaped so they match only a literal "."; the end anchor comes
  // last so the optional port and slash are actually checked.
  $tld   = "(?:co\.uk|com|org|net|co|in|dk|at|us|tv|info|uk|biz|se)";
  $label = "[A-Z0-9][A-Z0-9_-]*";
  $pattern = "/^(?:(?:https?|ftp):\/\/)?(?:{$label}\.)+{$tld}(?::\d+)?\/?$/i";
  return preg_match($pattern, $val) === 1;
}

$url = "google.com.in";
var_dump(validateURL($url)); // bool(true)
?>
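
If a fixed TLD list is too restrictive, another option is to keep the pattern from the question and only tighten its www branch so that a second dot is required. The sketch below uses Python purely for illustration; URL_RE, BODY and is_valid_url are hypothetical names, and the empty scheme alternative from the original pattern is dropped on the assumption that it was unintended.

```python
import re

# Character classes copied from the pattern in the question.
BODY = r"[-A-Za-z0-9+&@#/%?=~_|!:,.;]*[-A-Za-z0-9+&@#/%=~_|]"

# The scheme branch is unchanged. The www branch now needs "www." followed by
# a label, a dot, and at least one more letter (checked with a lookahead).
URL_RE = re.compile(
    r"^(?:(?:https?|ftp|file)://|www\.(?=[-A-Za-z0-9]+\.[A-Za-z]))" + BODY + r"$"
)


def is_valid_url(text: str) -> bool:
    """Return True if text matches the tightened pattern."""
    return URL_RE.match(text) is not None
```

With this pattern "www.foo" is rejected while the other four test strings from the question still match. URLs with an explicit scheme, such as "http://www.foo", are not affected by the extra check.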

Upvotes: 1

skarmats

Reputation: 1917

I would not recommend this.

www.foo, for example, could be a valid local host name.

Regardless of that, let System.Uri do the hard work, and access the various parts of the URL through its numerous properties:

http://msdn.microsoft.com/en-us/library/system.uri.aspx
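
The same parse-then-inspect idea can be sketched in Python, with urllib.parse standing in for System.Uri. This is only an illustration under stated assumptions: the rule that a host needs at least two labels after an optional leading "www" is inferred from the test cases in the question, the allowed schemes are limited to http, https and ftp, and is_acceptable_url is a hypothetical name.

```python
from urllib.parse import urlsplit

# Assumption: only these schemes are accepted by the asker's system.
ALLOWED_SCHEMES = {"http", "https", "ftp"}


def is_acceptable_url(text: str) -> bool:
    """Parse the input first, then apply simple rules to the parsed host."""
    # Scheme-less input such as "www.foo.com" would otherwise be parsed as a
    # path, so a default scheme is prepended first.
    candidate = text if "://" in text else "http://" + text
    try:
        parts = urlsplit(candidate)
        host = parts.hostname or ""
    except ValueError:
        # urlsplit rejects some malformed inputs, e.g. unbalanced brackets.
        return False
    if parts.scheme not in ALLOWED_SCHEMES:
        return False
    labels = host.split(".")
    # Ignore a leading "www" so that "www.foo" counts as a single label.
    if labels[0] == "www":
        labels = labels[1:]
    # Require at least two non-empty labels, e.g. "foo" and "com".
    return len(labels) >= 2 and all(labels)
```

Note that this rule also rejects single-label hosts such as "localhost", which may be perfectly valid on a local network; whether that is desirable depends on the system.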

Upvotes: 4
