Bungle

Reputation: 19712

How to extract base URL from a string in JavaScript?

I'm trying to find a relatively easy and reliable method to extract the base URL from a string variable using JavaScript (or jQuery).

For example, given something like:

http://www.sitename.com/article/2009/09/14/this-is-an-article/

I'd like to get:

http://www.sitename.com/

Is a regular expression the best bet? If so, what statement could I use to assign the base URL extracted from a given string to a new variable?

I've done some searching on this, but everything I find in the JavaScript world seems to revolve around gathering this information from the actual document URL using location.host or similar.

Upvotes: 188

Views: 322309

Answers (22)

Surya R Praveen

Reputation: 3745

Use the URL constructor (new URL()) to read the origin, hostname, pathname, etc.

var urlpath = 'http://www.sitename.com/article/2009/09/14/this-is-an-article/'
const mynewurlpath = new URL(urlpath);
console.log(mynewurlpath.origin);
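
A minimal sketch of wrapping this for arbitrary strings, since new URL() throws a TypeError when the input is not an absolute URL. The helper name getOriginOrNull is hypothetical, and returning null on failure is only one possible choice:

```javascript
// Hypothetical helper: returns the origin of an absolute URL string,
// or null when the string cannot be parsed as one.
function getOriginOrNull(str) {
  try {
    return new URL(str).origin;
  } catch (err) {
    // new URL() throws for relative or malformed input when no base is given
    return null;
  }
}

console.log(getOriginOrNull('http://www.sitename.com/article/2009/09/14/this-is-an-article/'));
// "http://www.sitename.com"
console.log(getOriginOrNull('/article/2009/09/14/')); // null
```

Note that origin has no trailing slash, so append '/' if the base URL should end with one, as in the question.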


Upvotes: 0

Alexander

Reputation: 7832

Implementation:

const getOriginByUrl = url => url.split('/').slice(0, 3).join('/');

Test:

getOriginByUrl('http://www.sitename.com:3030/article/2009/09/14/this-is-an-article?lala=kuku');

Result:

'http://www.sitename.com:3030'

Upvotes: 2

epascarello

Reputation: 207501

There is no reason to do splits to get the path, hostname, etc. from a string that is a link. You just need to use a link element:

//create a new element link with your link
var a = document.createElement("a");
a.href="http://www.sitename.com/article/2009/09/14/this-is-an-article/";

//hide it from view when it is added
a.style.display="none";

//add it
document.body.appendChild(a);

//read the links "features"
alert(a.protocol);
alert(a.hostname)
alert(a.pathname)
alert(a.port);
alert(a.hash);

//remove it
document.body.removeChild(a);

You can easily do the same with jQuery by appending the element and reading its attr.

Update: There is now new URL(), which simplifies it:

const myUrl = new URL("https://www.example.com:3000/article/2009/09/14/this-is-an-article/#m123")

const parts = ['protocol', 'hostname', 'pathname', 'port', 'hash'];

parts.forEach(key => console.log(key, myUrl[key]))

Upvotes: 42

V. Sambor

Reputation: 13379

A good way is to use the native JavaScript URL API. It exposes many useful parts of a URL.

For example:

const url = 'https://stackoverflow.com/questions/1420881/how-to-extract-base-url-from-a-string-in-javascript'

const urlObject = new URL(url);

console.log(urlObject);


// RESULT: 
//________________________________
hash: "",
host: "stackoverflow.com",
hostname: "stackoverflow.com",
href: "https://stackoverflow.com/questions/1420881/how-to-extract-base-url-from-a-string-in-javascript",
origin: "https://stackoverflow.com",
password: "",
pathname: "/questions/1420881/how-to-extract-base-url-from-a-string-in-javascript",
port: "",
protocol: "https:",
search: "",
searchParams: [object URLSearchParams]
... + some other methods

As you can see here you can just access whatever you need.

For example: console.log(urlObject.host); // "stackoverflow.com"
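
When the URL carries a port, the choice between host, hostname and origin starts to matter. A short sketch, using an illustrative URL that is not from the answer:

```javascript
// Illustrative URL with an explicit, non-default port (an assumption for this sketch).
const withPort = new URL('https://www.sitename.com:8443/article/?q=1#top');

console.log(withPort.hostname); // "www.sitename.com"
console.log(withPort.port);     // "8443"
console.log(withPort.host);     // "www.sitename.com:8443"
console.log(withPort.origin);   // "https://www.sitename.com:8443"

// A default port (443 for https) is dropped during parsing.
const defaultPort = new URL('https://www.sitename.com:443/article/');
console.log(defaultPort.port);   // ""
console.log(defaultPort.origin); // "https://www.sitename.com"
```

For a base URL in the sense of the question, origin is usually the property to reach for, since it already combines protocol, hostname and any non-default port.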

doc for URL

Upvotes: 9

Hasib Ullah Khan

Reputation: 1

var tillLastSlashRegex = /^.*\//;
var baseUrl = tillLastSlashRegex.exec(window.location.href)[0];

window.location.href gives the current URL from the browser's address bar.

It can be anything, such as https://stackoverflow.com/abc/xyz or https://www.google.com/search?q=abc. tillLastSlashRegex.exec() runs the regex and returns a match array whose first element is the string up to the last forward slash, i.e. https://stackoverflow.com/abc/ or https://www.google.com/ respectively.

Upvotes: 0

Tom Kay

Reputation: 1551

To get the origin of any URL, whether a path within a website (/my/path), scheme-less (//example.com/my/path), or full (http://example.com/my/path), I put together a quick function.

In the snippet below, all three calls should log https://stacksnippets.net.

function getOrigin(url)
{
  if(/^\/\//.test(url))
  { // no scheme, use current scheme, extract domain
    url = window.location.protocol + url;
  }
  else if(/^\//.test(url))
  { // just path, use whole origin
    url = window.location.origin + url;
  }
  return url.match(/^([^/]+\/\/[^/]+)/)[0];
}

console.log(getOrigin('https://stacksnippets.net/my/path'));
console.log(getOrigin('//stacksnippets.net/my/path'));
console.log(getOrigin('/my/path'));
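
The same three cases can also be resolved by the URL constructor's optional second argument, which supplies a base for relative input. This is a sketch rather than a drop-in replacement: the helper name getOriginWithBase and the base value are assumptions, and in a browser the base could be window.location.href.

```javascript
// Hypothetical helper: resolve `url` against `base`, then read the origin.
function getOriginWithBase(url, base) {
  return new URL(url, base).origin;
}

// Assumed stand-in for the current page URL.
const base = 'https://stacksnippets.net/some/page';

console.log(getOriginWithBase('https://stacksnippets.net/my/path', base)); // "https://stacksnippets.net"
console.log(getOriginWithBase('//stacksnippets.net/my/path', base));       // "https://stacksnippets.net"
console.log(getOriginWithBase('/my/path', base));                          // "https://stacksnippets.net"
```

An absolute first argument ignores the base, a scheme-relative one borrows only its scheme, and a path borrows the whole origin.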

Upvotes: 1

user170442

Reputation:

Edit: Some have complained that this doesn't take the protocol into account, so I decided to update the code, since it is marked as the answer. For those who like one-line code: sorry, but that is why we use code minifiers. Code should be human-readable, and this way is better, in my opinion.

var pathArray = "https://somedomain.com".split( '/' );
var protocol = pathArray[0];
var host = pathArray[2];
var url = protocol + '//' + host;

Or use David's solution below.

Upvotes: 229

devansvd

Reputation: 1009

Well, the URL API avoids splitting and constructing the URLs manually.

 let url = new URL('https://stackoverflow.com/questions/1420881');
 alert(url.origin);

Upvotes: 38

Abdennour TOUMI

Reputation: 93173

String.prototype.url = function() {
  const a = $('<a />').attr('href', this)[0];
  // or if you are not using jQuery 👇🏻
  // const a = document.createElement('a'); a.setAttribute('href', this);
  let origin = a.protocol + '//' + a.hostname;
  if (a.port.length > 0) {
    origin = `${origin}:${a.port}`;
  }
  const {host, hostname, pathname, port, protocol, search, hash} = a;
  return {origin, host, hostname, pathname, port, protocol, search, hash};

}

Then :

'http://mysite:5050/pke45#23'.url()
 //OUTPUT : {host: "mysite:5050", hostname: "mysite", pathname: "/pke45", port: "5050", protocol: "http:",hash:"#23",origin:"http://mysite:5050"}

For your request, you need :

 'http://mysite:5050/pke45#23'.url().origin

Review 07-2017: It can also be made more elegant, with more features:

const parseUrl = (string, prop) => {
  const a = document.createElement('a');
  a.setAttribute('href', string);
  const {host, hostname, pathname, port, protocol, search, hash} = a;
  const origin = `${protocol}//${hostname}${port.length ? `:${port}` : ''}`;
  const parts = {origin, host, hostname, pathname, port, protocol, search, hash};
  // look the requested property up instead of passing caller input to eval()
  return prop ? parts[prop] : parts;
}

Then

parseUrl('http://mysite:5050/pke45#23')
// {origin: "http://mysite:5050", host: "mysite:5050", hostname: "mysite", pathname: "/pke45", port: "5050"…}


parseUrl('http://mysite:5050/pke45#23', 'origin')
// "http://mysite:5050"

Cool!

Upvotes: 16

abelabbesnabi

Reputation: 1817

This works for me:

var getBaseUrl = function (url) {
  if (url) {
    var parts = url.split('://');
    
    if (parts.length > 1) {
      return parts[0] + '://' + parts[1].split('/')[0] + '/';
    } else {
      return parts[0].split('/')[0] + '/';
    }
  }
};

Upvotes: 0

Michael_Scharf

Reputation: 34498

I use a simple regex that extracts the host from the URL:

function get_host(url){
    return url.replace(/^((\w+:)?\/\/[^\/]+\/?).*$/,'$1');
}

and use it like this

var url = 'http://www.sitename.com/article/2009/09/14/this-is-an-article/'
var host = get_host(url);

Note, if the url does not end with a / the host will not end in a /.

Here are some tests:

describe('get_host', function(){
    it('should return the host', function(){
        var url = 'http://www.sitename.com/article/2009/09/14/this-is-an-article/';
        assert.equal(get_host(url),'http://www.sitename.com/');
    });
    it('should not have a / if the url has no /', function(){
        var url = 'http://www.sitename.com';
        assert.equal(get_host(url),'http://www.sitename.com');
    });
    it('should deal with https', function(){
        var url = 'https://www.sitename.com/article/2009/09/14/this-is-an-article/';
        assert.equal(get_host(url),'https://www.sitename.com/');
    });
    it('should deal with no protocol urls', function(){
        var url = '//www.sitename.com/article/2009/09/14/this-is-an-article/';
        assert.equal(get_host(url),'//www.sitename.com/');
    });
    it('should deal with ports', function(){
        var url = 'http://www.sitename.com:8080/article/2009/09/14/this-is-an-article/';
        assert.equal(get_host(url),'http://www.sitename.com:8080/');
    });
    it('should deal with localhost', function(){
        var url = 'http://localhost/article/2009/09/14/this-is-an-article/';
        assert.equal(get_host(url),'http://localhost/');
    });
    it('should deal with numeric ip', function(){
        var url = 'http://192.168.18.1/article/2009/09/14/this-is-an-article/';
        assert.equal(get_host(url),'http://192.168.18.1/');
    });
});

Upvotes: 7

David

Reputation: 3422

WebKit-based browsers, Firefox as of version 21 and current versions of Internet Explorer (IE 10 and 11) implement location.origin.

location.origin includes the protocol, the domain and optionally the port of the URL.

For example, location.origin of the URL http://www.sitename.com/article/2009/09/14/this-is-an-article/ is http://www.sitename.com.

To target browsers without support for location.origin use the following concise polyfill:

if (typeof location.origin === 'undefined')
    location.origin = location.protocol + '//' + location.host;

Upvotes: 154

Alain Beauvois

Reputation: 5926

This works:

location.href.split(location.pathname)[0];

Upvotes: 2

BMiner

Reputation: 17057

If you are extracting information from window.location.href (the address bar), then use this code to get http://www.sitename.com/:

var loc = location;
var url = loc.protocol + "//" + loc.host + "/";

If you have a string, str, that is an arbitrary URL (not window.location.href), then use regular expressions:

var url = str.match(/^(([a-z]+:)?(\/\/)?[^\/]+\/).*$/)[1];

I, like everyone in the Universe, hate reading regular expressions, so I'll break it down in English:

  • Find one or more lowercase letters followed by a colon (the protocol, which can be omitted)
  • Followed by // (can also be omitted)
  • Followed by any characters except / (the hostname and port)
  • Followed by /
  • Followed by whatever (the path, less the beginning /).

No need to create DOM elements or do anything crazy.
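
The pattern above expects a slash after the host, so a string such as http://www.sitename.com does not yield the expected base. Below is a sketch of a stricter variant, assuming the input always carries a scheme followed by //. The helper name baseUrlFromString and the choice to return null for non-matching input are hypothetical:

```javascript
// Hypothetical variant: requires a scheme and "//", and stops the host
// at the first "/", "?" or "#" (or at the end of the string).
function baseUrlFromString(str) {
  const match = /^([a-z][a-z0-9+.-]*:)\/\/([^\/?#]+)/i.exec(str);
  // match[1] is the scheme with its colon, match[2] the host and optional port
  return match ? match[1] + '//' + match[2] + '/' : null;
}

console.log(baseUrlFromString('http://www.sitename.com/article/2009/09/14/this-is-an-article/'));
// "http://www.sitename.com/"
console.log(baseUrlFromString('http://www.sitename.com')); // "http://www.sitename.com/"
console.log(baseUrlFromString('not a url'));               // null
```

Any user:password@ prefix would be kept as part of the host here, which may or may not be wanted.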

Upvotes: 8

kta

Reputation: 20110

var host = location.protocol + '//' + location.host + '/';

Upvotes: 21

daddywoodland

Reputation: 1512

Don't need to use jQuery, just use

location.hostname

Upvotes: 44

Nimesh07

Reputation: 362

You can use the code below to read different parts of the current URL:

alert("document.URL : "+document.URL);
alert("document.location.href : "+document.location.href);
alert("document.location.origin : "+document.location.origin);
alert("document.location.hostname : "+document.location.hostname);
alert("document.location.host : "+document.location.host);
alert("document.location.pathname : "+document.location.pathname);

Upvotes: 7

shaikh

Reputation: 1395

function getBaseURL() {
    var url = location.href;  // entire url including querystring - also: window.location.href;
    var baseURL = url.substring(0, url.indexOf('/', 14));


    if (baseURL.indexOf('http://localhost') != -1) {
        // Base Url for localhost
        var url = location.href;  // window.location.href;
        var pathname = location.pathname;  // window.location.pathname;
        var index1 = url.indexOf(pathname);
        var index2 = url.indexOf("/", index1 + 1);
        var baseLocalUrl = url.substr(0, index2);

        return baseLocalUrl + "/";
    }
    else {
        // Root Url for domain name
        return baseURL + "/";
    }

}

You then can use it like this...

var str = 'http://en.wikipedia.org/wiki/Knopf?q=1&t=2';
var url = str.toUrl();

The value of url will be...

{
"original":"http://en.wikipedia.org/wiki/Knopf?q=1&t=2",
"protocol":"http:",
"domain":"wikipedia.org",
"host":"en.wikipedia.org",
"relativePath":"wiki"
}

The "var url" also contains two methods.

var paramQ = url.getParameter('q');

In this case the value of paramQ will be 1.

var allParameters = url.getParameters();

The value of allParameters will be the parameter names only.

["q","t"]

Tested on IE, Chrome and Firefox.

Upvotes: 3

alexandru.topliceanu

Reputation: 2364

A lightweight but complete approach to getting basic values from a string representation of a URL is Douglas Crockford's regexp rule:

var yourUrl = "http://www.sitename.com/article/2009/09/14/this-is-an-article/";
var parse_url = /^(?:([A-Za-z]+):)?(\/{0,3})([0-9.\-A-Za-z]+)(?::(\d+))?(?:\/([^?#]*))?(?:\?([^#]*))?(?:#(.*))?$/;
var parts = parse_url.exec( yourUrl );
var result = parts[1]+':'+parts[2]+parts[3]+'/' ;
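
The result line above leaves out the port, which the regular expression captures in parts[4]. Below is a sketch that names the captured groups and keeps the port when present. The function name crockfordBase and the null return for unmatched or scheme-less input are assumptions, not part of Crockford's code:

```javascript
// Same regular expression as above; only the surrounding names are new.
const parseUrlRegex = /^(?:([A-Za-z]+):)?(\/{0,3})([0-9.\-A-Za-z]+)(?::(\d+))?(?:\/([^?#]*))?(?:\?([^#]*))?(?:#(.*))?$/;

// Hypothetical helper built on the captured groups.
function crockfordBase(urlString) {
  const parts = parseUrlRegex.exec(urlString);
  // give up when nothing matched or the scheme group is empty
  if (!parts || !parts[1]) return null;
  const [, scheme, slash, host, port] = parts;
  return scheme + ':' + slash + host + (port ? ':' + port : '') + '/';
}

console.log(crockfordBase('http://www.sitename.com/article/2009/09/14/this-is-an-article/'));
// "http://www.sitename.com/"
console.log(crockfordBase('http://www.sitename.com:8080/article/?q=1#top'));
// "http://www.sitename.com:8080/"
```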

If you are looking for a more powerful URL manipulation toolkit, try URI.js. It supports getters, setters, URL normalization, etc., all with a nice chainable API.

If you are looking for a jQuery plugin, then jquery.url.js should help you.

A simpler way to do it is by using an anchor element, as @epascarello suggested. This has the disadvantage that you have to create a DOM Element. However this can be cached in a closure and reused for multiple urls:

var parseUrl = (function () {
  var a = document.createElement('a');
  return function (url) {
    a.href = url;
    return {
      host: a.host,
      hostname: a.hostname,
      pathname: a.pathname,
      port: a.port,
      protocol: a.protocol,
      search: a.search,
      hash: a.hash
    };
  }
})();

Use it like so:

parseUrl('http://google.com');

Upvotes: 10

sova

Reputation: 101

Instead of having to account for window.location.protocol and window.location.origin, and possibly missing a specified port number, etc., just grab everything up to the 3rd "/":

// get nth occurrence of a character c in the calling string
String.prototype.nthIndex = function (n, c) {
    var index = -1;
    while (n-- > 0) {
        index++;
        if (this.substring(index) == "") return -1; // don't run off the end
        index += this.substring(index).indexOf(c);
    }
    return index;
}

// get the base URL of the current page by taking everything up to the third "/" in the URL
function getBaseURL() {
    return document.URL.substring(0, document.URL.nthIndex(3,"/") + 1);
}
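
The same "up to the third slash" idea can be sketched for an arbitrary string, without extending String.prototype. The function name baseUrlUpToThirdSlash is hypothetical, and the sketch assumes the input is an absolute URL of the form scheme://host/...:

```javascript
// Hypothetical helper: keep everything up to and including the third "/".
function baseUrlUpToThirdSlash(str) {
  let index = -1;
  for (let i = 0; i < 3; i++) {
    index = str.indexOf('/', index + 1);
    // fewer than three slashes (e.g. "http://host"): treat the whole string as the base
    if (index === -1) return str + '/';
  }
  return str.substring(0, index + 1);
}

console.log(baseUrlUpToThirdSlash('http://www.sitename.com/article/2009/09/14/this-is-an-article/'));
// "http://www.sitename.com/"
console.log(baseUrlUpToThirdSlash('http://www.sitename.com:3030/article'));
// "http://www.sitename.com:3030/"
```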

Upvotes: 3

Wayne

Reputation: 501

If you're using jQuery, this is a kinda cool way to manipulate elements in javascript without adding them to the DOM:

var myAnchor = $("<a />");

//set href    
myAnchor.attr('href', 'http://example.com/path/to/myfile')

//your link's features
var hostname = myAnchor.prop('hostname'); // example.com
var pathname = myAnchor.prop('pathname'); // /path/to/myfile
//...etc

Upvotes: 12

Clement Herreman

Reputation: 10536

You can do it using a regex:

/(http:\/\/)?(www)[^\/]+\//i

Does that fit?

Upvotes: 1
