motis10

Reputation: 2596

JS Regex url validation

I tried to validate a URL with or without the http prefix, but no matter what I did the function returned false. I checked my regex string on this site: http://regexr.com/ and it behaved as I expected.

    function isUrlValid(userInput) {
        var regexQuery = "/(http(s)?://.)?(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)/";
        var url = new RegExp(regexQuery,"g");
        if (url.test(userInput)) {
            alert('Great, you entered an E-Mail-address');
            return true;
        }
        return false;
    }

I fixed the problem by changing .test to .match and leaving the regex as is.

Upvotes: 25

Views: 101182

Answers (6)

akash gupta

Reputation: 1

function isUrlValid(userInput) {
    var regexQuery = "^(https?:\\/\\/)?((([-a-z0-9]{1,63}\\.)*?[a-z0-9]([-a-z0-9]{0,253}[a-z0-9])?\\.[a-z]{2,63})|((\\d{1,3}\\.){3}\\d{1,3}))(:\\d{1,5})?((\\/|\\?)((%[0-9a-f]{2})|[-\\w\\+\\.\\?\\/@~#&=])*)?$";
    var url = new RegExp(regexQuery,"i");
    return url.test(userInput);
}
var input = ["http://localhost/pwc/public/enus/forms/pwc-external-learning-object/NTA3NDA/NTg3",
             "HTTP://EX-AMPLE.COM",
             "example.c",
             "example-.com",
             "www.police.academy",
             "https://x.com/?twitter?",
             "12.34.56.78:9000",
             "http://example.com?a=%bc&d=%ef&g=%H"];
for (var i in input) document.write(isUrlValid(input[i]) + ": " + input[i] + "<br>");

Upvotes: 0

Bullsized

Reputation: 577

Here's my TypeScript solution and a link to test it:

/**
 * This regex pattern aims to match URLs that start with optional protocols (http://, https://, or ftp://),
 * followed by a domain name, domain extension, and various characters that form the path, query, or fragment part of
 * the URL, as well as allowing `%` as a valid character (for encoded characters).
 *
 * https://regexr.com/7q3qi
 */
const URL_PATTERN: RegExp = /^(?:(?:http|https|ftp):\/\/)?[\w.-]+(?:\.[\w\.-]+)+[\w\-._~:/?#[\]@!$&'()*+,;=%]+$/;
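
As a rough usage sketch, the pattern can be exercised like this. The code is plain JavaScript (the type annotation is dropped), and the sample inputs are illustrative rather than taken from the answer.

```javascript
// Same pattern as above, without the TypeScript annotation.
const URL_PATTERN = /^(?:(?:http|https|ftp):\/\/)?[\w.-]+(?:\.[\w\.-]+)+[\w\-._~:/?#[\]@!$&'()*+,;=%]+$/;

// Illustrative inputs (assumptions, not from the answer).
console.log(URL_PATTERN.test("https://example.com/path?x=1"));  // true
console.log(URL_PATTERN.test("ftp://files.example.org/a.txt")); // true
console.log(URL_PATTERN.test("not a url"));                     // false: spaces are not allowed
console.log(URL_PATTERN.test("localhost"));                     // false: at least one dot is required
```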

Upvotes: 0

I believe the other answer will reject some valid URLs (like domain names in uppercase or long sub-domains) and allow some invalid ones (like www.-example-.com or www.%@&.com). I tried to take into account a number of additional URL syntax rules (without getting into internationalisation).

function isUrlValid(userInput) {
    var regexQuery = "^(https?:\\/\\/)?((([-a-z0-9]{1,63}\\.)*?[a-z0-9]([-a-z0-9]{0,253}[a-z0-9])?\\.[a-z]{2,63})|((\\d{1,3}\\.){3}\\d{1,3}))(:\\d{1,5})?((\\/|\\?)((%[0-9a-f]{2})|[-\\w\\+\\.\\?\\/@~#&=])*)?$";
    var url = new RegExp(regexQuery,"i");
    return url.test(userInput);
}
var input = ["https://a.long.sub-domain.example.com/foo/bar?foo=bar&boo=far#a%20b",
             "HTTP://EX-AMPLE.COM",
             "example.c",
             "example-.com",
             "www.police.academy",
             "https://x.com/?twitter?",
             "12.34.56.78:9000",
             "http://example.com?a=%bc&d=%ef&g=%H"];
for (var i in input) document.write(isUrlValid(input[i]) + ": " + input[i] + "<br>");

Here's a breakdown of the regex:

^                                      // start of URL

(                                      // protocol section
    https?                             // http or https
    :\\/\\/                            // colon and double slash
)?                                     // section can be omitted

(                                      // domain or IP address
    (
        (                              // sub-domain section
            [-a-z0-9]{1,63}            // 1 to 63 characters
            \\.                        // followed by dot
        )*?                            // any number of sections (lazy)

        [a-z0-9]                       // no hyphen at start
        (
            [-a-z0-9]{0,253}           // domain name
            [a-z0-9]                   // no hyphen at end
        )?                             // allow 1-letter domains

        \\.                            // dot
        [a-z]{2,63}                    // top-level domain
    )
    |                                  // or ...
    (                                  // IP address
        (
            \\d{1,3}                   // 1 to 3 digits
            \\.                        // followed by dot
        ){3}                           // three times
        \\d{1,3}                       // 1 to 3 digits
    )
)

(                                      // port section
    :                                  // colon
    \\d{1,5}                           // port number
)?                                     // section can be omitted

(                                      // file path and/or query section
    (                                  // section must start with ...
        \\/                            // slash
        |                              // or ...
        \\?                            // question mark
    )
    (
        (                              // escaped character
            %                          // percent
            [0-9a-f]{2}                // hex number
        )
        |                              // or ...
        [                              // literal character
            -                          // hyphen
            \\w                        // letter, digit or underscore
            \\+                        // plus
            \\.                        // dot
            \\?                        // question mark
            \\/                        // slash
            @~#&=                      // at, tilde, hash, ampersand, equal sign
        ]
    )*                                 // any number of characters
)?                                     // section can be omitted

$                                      // end of URL
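
One consequence of the IP address section in the breakdown above is that each group accepts any one to three digits, so a string such as 999.999.999.999 matches the IP alternative. If that matters, a separate range check could look like the sketch below. The function name hasValidOctets is hypothetical and not part of the answer.

```javascript
// Hypothetical helper: checks that a dotted-quad host has exactly
// four numeric parts, each in the range 0..255.
function hasValidOctets(host) {
    const parts = host.split(".");
    return parts.length === 4 &&
        parts.every((part) => /^\d{1,3}$/.test(part) && Number(part) <= 255);
}

console.log(hasValidOctets("12.34.56.78"));     // true
console.log(hasValidOctets("999.999.999.999")); // false
console.log(hasValidOctets("1.2.3"));           // false
```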

Note that the regex is used in case-insensitive mode, because capital letters are allowed in every part of a URL.

Theoretically, there should always be a slash between the domain and a query, but in the wild you will find a lot of URLs with the domain immediately followed by a question mark, so I've allowed those.

There are also rules on the maximum length of a url, so you may want to check that separately.
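
A separate length check might look like the sketch below. The answer does not say which limit applies, so the 2048-character cap is an assumption (a commonly used practical limit), and the name isUrlShortEnough is hypothetical.

```javascript
// Assumed cap: 2048 is a common practical limit, not a value from the answer.
const MAX_URL_LENGTH = 2048;

// Hypothetical helper: true when the input is a string within the cap.
function isUrlShortEnough(userInput, maxLength = MAX_URL_LENGTH) {
    return typeof userInput === "string" && userInput.length <= maxLength;
}

console.log(isUrlShortEnough("http://example.com")); // true
console.log(isUrlShortEnough("a".repeat(3000)));     // false
```

Running this check before the regex also avoids matching against very long inputs.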

(The original answer was written in 2015, but I've updated it because longer top-level domains are now in use, and single-letter domains have become more relevant because of x.com).

Upvotes: 13

AmerllicA

Reputation: 32472

This question calls for a robust regex, and the following code is not hard to understand (ES6, TypeScript):

const isValidUrl = (url: string): boolean => {
  const urlRegex = /^((http(s?)?):\/\/)?([wW]{3}\.)?[a-zA-Z0-9\-.]+\.[a-zA-Z]{2,}(\.[a-zA-Z]{2,})?$/g;
  const result = url.match(urlRegex);

  return result !== null;
};
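
To show what this pattern does and does not accept, here is a small sketch in plain JavaScript (type annotations removed). The sample inputs are illustrative and not from the answer. Because the pattern ends right after the domain, anything with a path is rejected.

```javascript
// Same pattern as the answer above, without the TypeScript annotations.
const isValidUrl = (url) => {
  const urlRegex = /^((http(s?)?):\/\/)?([wW]{3}\.)?[a-zA-Z0-9\-.]+\.[a-zA-Z]{2,}(\.[a-zA-Z]{2,})?$/g;
  return url.match(urlRegex) !== null;
};

// Illustrative inputs (assumptions, not from the answer).
console.log(isValidUrl("example.com"));               // true
console.log(isValidUrl("https://www.example.co.uk")); // true
console.log(isValidUrl("https://example.com/path"));  // false: no path is allowed
console.log(isValidUrl("example"));                   // false: no top-level domain
```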

Upvotes: 8

Rahul Mahadik

Reputation: 1020

Try this code.

function CheckURL(fieldId, alertMessage) {
    var url = fieldId.value;
    if(url !== "")
    {
        if (url.match(/(http(s)?:\/\/.)?(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)/g) !== null)
            return true;
        else {
            alert(alertMessage);
            fieldId.focus();
            return false;
        }
    }
}

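// This fragment must run inside a function (for example a form's submit
// handler), because `return` is not allowed at the top level of a script.
var website = document.getElementById('Website');
if (!CheckURL(website, "Enter a valid website address")) {
    return false;
}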

Upvotes: 1

motis10

Reputation: 2596

I changed the function to use match and changed the slashes in this part of the pattern, and now it works: (http(s)?://.)

The fixed function:

function isUrlValid(userInput) {
    var res = userInput.match(/(http(s)?:\/\/.)?(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)/g);
    return res !== null;
}
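
For context on why the original version always returned false, here is a minimal sketch (the sample strings and variable names are illustrative, not from the question). When a pattern is passed to new RegExp as a string, the surrounding slashes are treated as literal characters, and string escapes are processed first, so \b becomes a backspace character rather than a word boundary.

```javascript
// Slashes inside the string are part of the pattern, not delimiters.
const fromString = new RegExp("/abc/");
const fromLiteral = /abc/;

console.log(fromString.test("abc"));   // false: the pattern requires literal slashes
console.log(fromString.test("/abc/")); // true
console.log(fromLiteral.test("abc"));  // true

// In a string literal, "\b" is a backspace (char code 8), so the
// word-boundary meaning is lost unless the backslash is doubled.
const lostBoundary = new RegExp("\bcat\b");
const keptBoundary = new RegExp("\\bcat\\b");

console.log(lostBoundary.test("a cat sat")); // false
console.log(keptBoundary.test("a cat sat")); // true
```

Using a regex literal, as the fixed function does, avoids both problems. Note that the fixed pattern is not anchored with ^ and $, so match succeeds whenever any part of the input looks like a URL.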

Upvotes: 45
