Panagiotis

Reputation: 261

Javascript: Website url validation with regex

I'm working on creating a regular expression in JavaScript to validate website URLs. I searched the Stack Overflow community a bit and did not find anything completely helpful.

My regex until now: /(https?:\/\/)?(www\.)?[a-zA-Z0-9]+\.[a-zA-Z]{2,}/g

But it seems to fail: it wrongly passes URLs that start with only two w's, like ww.test.com
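A minimal sketch of why this can happen: the pattern has no ^ and $ anchors, so it succeeds as soon as any substring matches, and the g flag makes test() remember where the last match ended. The names unanchored, anchored and sticky are made up for this illustration.

```javascript
// The pattern from the question, without the g flag so test() keeps no state.
const unanchored = /(https?:\/\/)?(www\.)?[a-zA-Z0-9]+\.[a-zA-Z]{2,}/;

// Hypothetical anchored variant: same pattern, but it must span the whole input.
const anchored = /^(https?:\/\/)?(www\.)?[a-zA-Z0-9]+\.[a-zA-Z]{2,}$/;

// Without anchors, "ww" is read as the name and "test" as the extension.
const partial = unanchored.exec('ww.test.com');
console.log(partial[0]);                    // "ww.test"
console.log(anchored.test('ww.test.com'));  // false
console.log(anchored.test('www.test.com')); // true

// With the g flag, test() resumes from lastIndex, so repeated calls can flip.
const sticky = /(https?:\/\/)?(www\.)?[a-zA-Z0-9]+\.[a-zA-Z]{2,}/g;
const first = sticky.test('test.com');  // true, lastIndex moves to the end
const second = sticky.test('test.com'); // false, the search starts at the end
console.log(first, second);
```

Anchoring alone is not a complete fix: the anchored variant also rejects test.co.uk, because the pattern allows only one dot-separated extension.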

Should pass the test of regex:

http://www.test.com
https://www.test.com
www.test.com
www.test.co.uk
www.t.com
test.com
test.fr
test.co.uk

Should not pass the test of regex:

w.test.com
ww.test.com
www.test
test
ww.test.
.test
.test.com
.test.co.ul
.test.

Any suggestions or thoughts?

Upvotes: 2

Views: 9611

Answers (3)

Seph Reed

Reputation: 10878

Here's an unofficial pattern that works for most things, with an explanation. It should be good enough for most situations.

(https?:\/\/)?[\w\-~]+(\.[\w\-~]+)+(\/[\w\-~]*)*(#[\w\-]*)?(\?.*)?

  1. (https?:\/\/)? - start with http:// or https:// or not
  2. [\w\-~]+(\.[\w\-~]+)+ - follow it with the domain name [\w\-~]+ and at least one extension (\.[\w\-~]+)+
    • [\w\-~] == [a-zA-Z0-9_\-~]
    • Multiple extensions would mean test.go.place.com
  3. (\/[\w\-~]*)* then as many sub directories as wished
    • In order to easily make test.com/ pass, the slash does not enforce following characters. This can be abused like so: test.com/la////la.
  4. (#[\w\-]*)? - optionally followed by a fragment (an element id)
  5. (\?.*)? Followed maybe by url params, which (for the sake of simplicity) can be pretty much whatever

There are plenty of edge cases where this pattern rejects a valid URL or accepts an invalid one. But for most cases where people aren't doing anything wacky, it should work.
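To see how the pattern behaves on the lists from the question, here is a small sketch. The ^ and $ anchors are an addition (the pattern above is written without them), and the variable names are made up for this example.

```javascript
// The pattern from this answer, wrapped in ^...$ so it must match the whole input.
const pattern = /^(https?:\/\/)?[\w\-~]+(\.[\w\-~]+)+(\/[\w\-~]*)*(#[\w\-]*)?(\?.*)?$/;

// Test cases copied from the question.
const shouldPass = [
  'http://www.test.com', 'https://www.test.com', 'www.test.com',
  'www.test.co.uk', 'www.t.com', 'test.com', 'test.fr', 'test.co.uk',
];
const shouldFail = [
  'w.test.com', 'ww.test.com', 'www.test', 'test', 'ww.test.',
  '.test', '.test.com', '.test.co.ul', '.test.',
];

// Collect the cases where the pattern disagrees with the question's lists.
const wronglyRejected = shouldPass.filter((s) => !pattern.test(s));
const wronglyAccepted = shouldFail.filter((s) => pattern.test(s));

console.log(wronglyRejected); // []
console.log(wronglyAccepted); // [ 'w.test.com', 'ww.test.com', 'www.test' ]
```

Under these assumptions the pattern accepts every URL on the first list, but it also lets w.test.com, ww.test.com and www.test through, since it treats www like any other label and does not require a top-level domain after it.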

Upvotes: 1

Ish

Reputation: 2105

/((http|https)\:\/\/)?[a-zA-Z0-9\.\/\?\:@\-_=#]+\.([a-zA-Z0-9\&\.\/\?\:@\-_=#])*/g

Upvotes: -1

philipp

Reputation: 16485

Even if this answer is a bit much for this problem, it illustrates the point: while it may be possible to write a regexp that checks the URL, it is much simpler and more robust to parse the URL into a real object, so that the overall test can be decomposed into a number of smaller tests.

The built-in URL constructor of modern browsers may help you here (LINK 1, LINK 2).

One approach to testing your URL might look like this:

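function testURL (urlstring) {
    var errors = [];
    try {
        // new URL() throws a TypeError when the string cannot be parsed
        var url = new URL(urlstring);

        // url.protocol includes the trailing colon, e.g. "https:"
        if (url.protocol !== 'https:') {
           errors.push('wrong protocol');
        }

        //more tests here

    } catch(err) {
        // record the failure, otherwise an unparsable string looks valid
        errors.push('invalid url');
    }
    return errors;
}


// 'mr.bean' has no scheme, so new URL() throws and runSomething() is skipped
if (testURL('mr.bean').length === 0) { runSomething(); }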
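As a rough sketch of how the "smaller tests" could cover the lists in the question, the function below parses the input with the URL constructor and then checks the hostname label by label. The name isPlainSiteUrl and every rule in it are assumptions made for this illustration (in particular, treating a leading "w" or "ww" as a mistyped "www" and requiring an alphabetic last label); they are not part of the URL API.

```javascript
// Hypothetical helper: true when the input looks like a plain website address.
function isPlainSiteUrl(input) {
  // The URL constructor needs a scheme, so assume http:// when none is given.
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(input);
  let url;
  try {
    url = new URL(hasScheme ? input : 'http://' + input);
  } catch (err) {
    return false; // not parsable at all
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') return false;

  const labels = url.hostname.split('.');
  // Assumption: a leading "w" or "ww" is a mistyped "www".
  if (/^w{1,2}$/.test(labels[0])) return false;
  if (labels[0] === 'www') labels.shift();
  // At least a name and a top-level domain must remain.
  if (labels.length < 2) return false;
  // No empty labels (leading or trailing dots) and only simple characters.
  if (!labels.every((label) => /^[a-z0-9-]+$/.test(label))) return false;
  // Assumption: the last label is alphabetic and at least two letters long.
  return /^[a-z]{2,}$/.test(labels[labels.length - 1]);
}

console.log(isPlainSiteUrl('www.test.co.uk')); // true
console.log(isPlainSiteUrl('ww.test.com'));    // false
console.log(isPlainSiteUrl('www.test'));       // false
```

Because each rule is a separate statement, it is easy to relax one of them (for example the "w"/"ww" check) without touching the rest, which is harder to do inside a single regular expression.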

Upvotes: 4
