Louis Rhys
Louis Rhys

Reputation: 35637

How to check whether a string is a valid HTTP URL?

There are the Uri.IsWellFormedUriString and Uri.TryCreate methods, but they seem to return true for file paths, etc.

How do I check whether a string is a valid (not necessarily active) HTTP URL for input validation purposes?

Upvotes: 383

Views: 327013

Answers (12)

user28253206
user28253206

Reputation: 1

def url_validator(url: str) -> bool:
    """
    use this func to filter out the urls to follow only valid urls
    :param: url
    :type: str
    :return: True if the passed url is valid otherwise return false
    :rtype: bool
    """

    #the following regex is copied from Django source code 
    # to validate a url using regax
    
    regex = re.compile(
        r"^(?:http|ftp)s?://"  # http:// or https://
        r"(?:(?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+(?:[A-Z]{2,6}\.?|[A-Z0-9-]{2,}\.?)|"  # domain...
        r"localhost|"  # localhost...
        r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"  # ...or ip
        r"(?::\d+)?"  # optional port
        r"(?:/?|[/?]\S+)$",
        re.IGNORECASE,
    )


    blocked_sites: list[str] = []

    for site in blocked_sites:
        if site in url or site == url:
            return False

    # if none of the above then ensure that the url is valid and then return True otherwise return False
    if re.match(regex, url):
        return True

    return False

Upvotes: 0

Erçin Dedeoğlu
Erçin Dedeoğlu

Reputation: 5383

    public static bool CheckURLValid(this string source)
    {
        Uri uriResult;
        return Uri.TryCreate(source, UriKind.Absolute, out uriResult) && uriResult.Scheme == Uri.UriSchemeHttp;
    }

Usage:

string url = "htts://adasd.xc.";
if(url.CheckUrlValid())
{
  //valid process
}

UPDATE: (single line of code) Thanks @GoClimbColorado

public static bool CheckURLValid(this string source) => Uri.TryCreate(source, UriKind.Absolute, out Uri uriResult) && uriResult.Scheme == Uri.UriSchemeHttps;

Usage:

string url = "htts://adasd.xc.";
if(url.CheckUrlValid())
{
  //valid process
}

Upvotes: 31

Radwan
Radwan

Reputation: 74

I've created this function to help me with URL validation, you can customize it as you like, note this is written in python3.10.6

def url_validator(url: str) -> bool:
    """
    use this func to filter out the urls to follow only valid urls
    :param: url
    :type: str
    :return: True if the passed url is valid otherwise return false
    :rtype: bool
    """

    #the following regex is copied from Django source code 
    # to validate a url using regax
    
    regex = re.compile(
        r"^(?:http|ftp)s?://"  # http:// or https://
        r"(?:(?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+(?:[A-Z]{2,6}\.?|[A-Z0-9-]{2,}\.?)|"  # domain...
        r"localhost|"  # localhost...
        r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"  # ...or ip
        r"(?::\d+)?"  # optional port
        r"(?:/?|[/?]\S+)$",
        re.IGNORECASE,
    )


    blocked_sites: list[str] = []

    for site in blocked_sites:
        if site in url or site == url:
            return False

    # if none of the above then ensure that the url is valid and then return True otherwise return False
    if re.match(regex, url):
        return True

    return False

Upvotes: 0

manishnitte
manishnitte

Reputation: 211

Problem: Valid URLs should include all of the following “prefixes”: https, http, www

  • Url must contain http:// or https://
  • Url may contain only one instance of www.
  • Url Host name type must be Dns
  • Url max length is 100

Solution:

public static bool IsValidUrl(string webSiteUrl)
{
   if (webSiteUrl.StartsWith("www."))
   {
       webSiteUrl = "http://" + webSiteUrl;
   }
        
   return Uri.TryCreate(webSiteUrl, UriKind.Absolute, out Uri uriResult)
            && (uriResult.Scheme == Uri.UriSchemeHttp
             || uriResult.Scheme == Uri.UriSchemeHttps) && uriResult.Host.Replace("www.", "").Split('.').Count() > 1 && uriResult.HostNameType == UriHostNameType.Dns && uriResult.Host.Length > uriResult.Host.LastIndexOf(".") + 1 && 100 >= webSiteUrl.Length;
}

Validated with Unit Tests

Positive Unit Test:

    [TestCase("http://www.example.com/")]
    [TestCase("https://www.example.com")]
    [TestCase("http://example.com")]
    [TestCase("https://example.com")]
    [TestCase("www.example.com")]
    public void IsValidUrlTest(string url)
    {
        bool result = UriHelper.IsValidUrl(url);

        Assert.AreEqual(result, true);
    }

Negative Unit Test:

    [TestCase("http.www.example.com")]
    [TestCase("http:www.example.com")]
    [TestCase("http:/www.example.com")]
    [TestCase("http://www.example.")]
    [TestCase("http://www.example..com")]
    [TestCase("https.www.example.com")]
    [TestCase("https:www.example.com")]
    [TestCase("https:/www.example.com")]
    [TestCase("http:/example.com")]
    [TestCase("https:/example.com")]
    public void IsInvalidUrlTest(string url)
    {
        bool result = UriHelper.IsValidUrl(url);

        Assert.AreEqual(result, false);
    }

Note: IsValidUrl method should not validate any relative url path like example.com

See:

Should I Use Relative or Absolute URLs?

Upvotes: 4

Ifeanyi Chukwu
Ifeanyi Chukwu

Reputation: 3317

As an alternative approach to using a regex, this code uses Uri.TryCreate per the OP, but then also checks the result to ensure that its Scheme is one of http or https:

bool passed =
  Uri.TryCreate(url, UriKind.Absolute, out Uri uriResult)
    && (uriResult.Scheme == Uri.UriSchemeHttp
      || uriResult.Scheme == Uri.UriSchemeHttps);

Upvotes: 5

Marco Concas
Marco Concas

Reputation: 1892

Try that:

bool IsValidURL(string URL)
{
    string Pattern = @"^(?:http(s)?:\/\/)?[\w.-]+(?:\.[\w\.-]+)+[\w\-\._~:/?#[\]@!\$&'\(\)\*\+,;=.]+$";
    Regex Rgx = new Regex(Pattern, RegexOptions.Compiled | RegexOptions.IgnoreCase);
    return Rgx.IsMatch(URL);
}

It will accept URL like that:

  • http(s)://www.example.com
  • http(s)://stackoverflow.example.com
  • http(s)://www.example.com/page
  • http(s)://www.example.com/page?id=1&product=2
  • http(s)://www.example.com/page#start
  • http(s)://www.example.com:8080
  • http(s)://127.0.0.1
  • 127.0.0.1
  • www.example.com
  • example.com

Upvotes: 43

41686d6564
41686d6564

Reputation: 19641

All the answers here either allow URLs with other schemes (e.g., file://, ftp://) or reject human-readable URLs that don't start with http:// or https:// (e.g., www.google.com) which is not good when dealing with user inputs.

Here's how I do it:

public static bool ValidHttpURL(string s, out Uri resultURI)
{
    if (!Regex.IsMatch(s, @"^https?:\/\/", RegexOptions.IgnoreCase))
        s = "http://" + s;

    if (Uri.TryCreate(s, UriKind.Absolute, out resultURI))
        return (resultURI.Scheme == Uri.UriSchemeHttp || 
                resultURI.Scheme == Uri.UriSchemeHttps);

    return false;
}

Usage:

string[] inputs = new[] {
                          "https://www.google.com",
                          "http://www.google.com",
                          "www.google.com",
                          "google.com",
                          "javascript:alert('Hack me!')"
                        };
foreach (string s in inputs)
{
    Uri uriResult;
    bool result = ValidHttpURL(s, out uriResult);
    Console.WriteLine(result + "\t" + uriResult?.AbsoluteUri);
}

Output:

True    https://www.google.com/
True    http://www.google.com/
True    http://www.google.com/
True    http://google.com/
False

Upvotes: 21

Kishath
Kishath

Reputation: 5784

This method works fine both in http and https. Just one line :)

if (Uri.IsWellFormedUriString("https://www.google.com", UriKind.Absolute))

MSDN: IsWellFormedUriString

Upvotes: 205

Arabela Paslaru
Arabela Paslaru

Reputation: 7074

Try this to validate HTTP URLs (uriName is the URI you want to test):

Uri uriResult;
bool result = Uri.TryCreate(uriName, UriKind.Absolute, out uriResult) 
    && uriResult.Scheme == Uri.UriSchemeHttp;

Or, if you want to accept both HTTP and HTTPS URLs as valid (per J0e3gan's comment):

Uri uriResult;
bool result = Uri.TryCreate(uriName, UriKind.Absolute, out uriResult) 
    && (uriResult.Scheme == Uri.UriSchemeHttp || uriResult.Scheme == Uri.UriSchemeHttps);

Upvotes: 629

Eranda
Eranda

Reputation: 868

Uri uri = null;
if (!Uri.TryCreate(url, UriKind.Absolute, out uri) || null == uri)
    return false;
else
    return true;

Here url is the string you have to test.

Upvotes: 1

Miserable Variable
Miserable Variable

Reputation: 28752

After Uri.TryCreate you can check Uri.Scheme to see if it HTTP(s).

Upvotes: 7

user3760031
user3760031

Reputation: 41

This would return bool:

Uri.IsWellFormedUriString(a.GetAttribute("href"), UriKind.Absolute)

Upvotes: 3

Related Questions