Reputation: 19712
I'm trying to find a relatively easy and reliable method to extract the base URL from a string variable using JavaScript (or jQuery).
For example, given something like:
http://www.sitename.com/article/2009/09/14/this-is-an-article/
I'd like to get:
http://www.sitename.com/
Is a regular expression the best bet? If so, what statement could I use to assign the base URL extracted from a given string to a new variable?
I've done some searching on this, but everything I find in the JavaScript world seems to revolve around gathering this information from the actual document URL using location.host or similar.
Upvotes: 188
Views: 322309
Reputation: 3745
Use the URL constructor to read the origin, hostname, pathname, etc.
var urlpath = 'http://www.sitename.com/article/2009/09/14/this-is-an-article/';
const mynewurlpath = new URL(urlpath);
console.log(mynewurlpath.origin);
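As a minimal sketch of how this could be wrapped for arbitrary input strings: the helper name getBaseUrl is hypothetical, and the trailing slash is appended only because the question asks for that form. It assumes http or https input; the URL constructor throws on strings it cannot parse as absolute URLs, so the sketch returns null in that case.

```javascript
// Hypothetical helper (not from the answer): origin plus a trailing slash, or null.
function getBaseUrl(urlString) {
  try {
    // .origin is scheme + host (+ non-default port), without a trailing slash
    return new URL(urlString).origin + '/';
  } catch (err) {
    // new URL() throws a TypeError for input that is not a parseable absolute URL
    return null;
  }
}

console.log(getBaseUrl('http://www.sitename.com/article/2009/09/14/this-is-an-article/'));
console.log(getBaseUrl('not a url'));
```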
Upvotes: 0
Reputation: 7832
Implementation:
const getOriginByUrl = url => url.split('/').slice(0, 3).join('/');
Test:
getOriginByUrl('http://www.sitename.com:3030/article/2009/09/14/this-is-an-article?lala=kuku');
Result:
'http://www.sitename.com:3030'
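The split approach assumes the input is an absolute URL of the form scheme://host/... A rough sketch of a guarded variant follows; the name getOriginByUrlChecked is made up for illustration. It returns null when the input does not split that way, but it still does not handle a query string attached directly to the host.

```javascript
// Hypothetical guarded variant of the split-based helper above.
const getOriginByUrlChecked = url => {
  const parts = url.split('/');
  // "http://host/path" splits into ["http:", "", "host", "path"]
  const looksAbsolute =
    parts.length >= 3 &&
    parts[0].endsWith(':') &&
    parts[1] === '' &&
    parts[2] !== '';
  return looksAbsolute ? parts.slice(0, 3).join('/') : null;
};

console.log(getOriginByUrlChecked('http://www.sitename.com:3030/article/2009/09/14/this-is-an-article?lala=kuku'));
console.log(getOriginByUrlChecked('www.sitename.com/article'));
```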
Upvotes: 2
Reputation: 207501
There is no reason to do splits to get the path, hostname, etc. from a string that is a link. You just need to use a link.
//create a new element link with your link
var a = document.createElement("a");
a.href="http://www.sitename.com/article/2009/09/14/this-is-an-article/";
//hide it from view when it is added
a.style.display="none";
//add it
document.body.appendChild(a);
//read the links "features"
alert(a.protocol);
alert(a.hostname)
alert(a.pathname)
alert(a.port);
alert(a.hash);
//remove it
document.body.removeChild(a);
You can easily do it with jQuery appending the element and reading its attr.
Update: There is now new URL(), which simplifies it:
const myUrl = new URL("https://www.example.com:3000/article/2009/09/14/this-is-an-article/#m123")
const parts = ['protocol', 'hostname', 'pathname', 'port', 'hash'];
parts.forEach(key => console.log(key, myUrl[key]))
Upvotes: 42
Reputation: 13379
A good way is to use the native JavaScript URL API object. It exposes many useful parts of a URL.
For example:
const url = 'https://stackoverflow.com/questions/1420881/how-to-extract-base-url-from-a-string-in-javascript'
const urlObject = new URL(url);
console.log(urlObject);
// RESULT:
//________________________________
hash: "",
host: "stackoverflow.com",
hostname: "stackoverflow.com",
href: "https://stackoverflow.com/questions/1420881/how-to-extract-base-url-from-a-string-in-javascript",
origin: "https://stackoverflow.com",
password: "",
pathname: "/questions/1420881/how-to-extract-base-url-from-a-string-in-javascript",
port: "",
protocol: "https:",
search: "",
searchParams: [object URLSearchParams]
... + some other methods
As you can see here you can just access whatever you need.
For example: console.log(urlObject.host); // "stackoverflow.com"
doc for URL
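To make the difference between host, hostname, port, and origin concrete, here is a small sketch with an explicit port. The example.com URLs are placeholders, not taken from the answer.

```javascript
// Placeholder URLs chosen for illustration only.
const withPort = new URL('https://example.com:8443/docs?page=2#intro');
console.log(withPort.host);     // host includes the port
console.log(withPort.hostname); // hostname does not
console.log(withPort.port);
console.log(withPort.origin);   // scheme + host, no trailing slash

// A port that is the default for the scheme is dropped during parsing.
const defaultPort = new URL('https://example.com:443/docs');
console.log(defaultPort.port === '');
console.log(defaultPort.origin);
```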
Upvotes: 9
Reputation: 1
var tilllastbackslashregex = new RegExp(/^.*\//);
var baseUrl = tilllastbackslashregex.exec(window.location.href)[0];
window.location.href gives the current URL from the browser's address bar.
It can be anything, such as https://stackoverflow.com/abc/xyz or https://www.google.com/search?q=abc. tilllastbackslashregex.exec() runs the regex and returns a match array whose first element is the string up to the last forward slash, i.e. https://stackoverflow.com/abc/ or https://www.google.com/ respectively.
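The same pattern can be applied to any string rather than window.location.href. The sketch below uses a hypothetical helper name, upToLastSlash. Note that this yields the directory part of the URL, which equals the base URL only when the path has a single segment.

```javascript
// Hypothetical helper: everything up to and including the last "/", or null.
function upToLastSlash(str) {
  const match = /^.*\//.exec(str);
  return match ? match[0] : null;
}

console.log(upToLastSlash('https://stackoverflow.com/abc/xyz'));
console.log(upToLastSlash('https://www.google.com/search?q=abc'));
// With no path at all, the last slash belongs to "://"
console.log(upToLastSlash('https://example.com'));
```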
Upvotes: 0
Reputation: 1551
To get the origin of any url, including paths within a website (/my/path), schemaless (//example.com/my/path), or full (http://example.com/my/path), I put together a quick function.
In the snippet below, all three calls should log https://stacksnippets.net.
function getOrigin(url)
{
if(/^\/\//.test(url))
{ // no scheme, use current scheme, extract domain
url = window.location.protocol + url;
}
else if(/^\//.test(url))
{ // just path, use whole origin
url = window.location.origin + url;
}
return url.match(/^([^/]+\/\/[^/]+)/)[0];
}
console.log(getOrigin('https://stacksnippets.net/my/path'));
console.log(getOrigin('//stacksnippets.net/my/path'));
console.log(getOrigin('/my/path'));
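For comparison, the URL constructor accepts a base as its second argument and resolves path-only and scheme-less inputs against it. This sketch passes the base explicitly so it also runs outside a browser; the name getOriginWithBase and the base value are assumptions for illustration. In a page, window.location.href could be passed instead.

```javascript
// Hypothetical alternative: let the URL constructor resolve relative input against a base.
function getOriginWithBase(url, base) {
  return new URL(url, base).origin;
}

// Stand-in for window.location.href when running outside a browser.
const base = 'https://stacksnippets.net/js';

console.log(getOriginWithBase('https://stacksnippets.net/my/path', base));
console.log(getOriginWithBase('//stacksnippets.net/my/path', base));
console.log(getOriginWithBase('/my/path', base));
```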
Upvotes: 1
Reputation:
Edit: Some complain that it doesn't take the protocol into account, so I decided to upgrade the code, since it is marked as the answer. For those who like one-line code... well, sorry, this is why we use code minifiers; code should be human-readable, and this way is better, in my opinion.
var pathArray = "https://somedomain.com".split( '/' );
var protocol = pathArray[0];
var host = pathArray[2];
var url = protocol + '//' + host;
Or use David's solution from below.
Upvotes: 229
Reputation: 1009
Well, the URL API object avoids splitting and constructing URLs manually.
let url = new URL('https://stackoverflow.com/questions/1420881');
alert(url.origin);
Upvotes: 38
Reputation: 93173
String.prototype.url = function() {
const a = $('<a />').attr('href', this)[0];
// or if you are not using jQuery 👇🏻
// const a = document.createElement('a'); a.setAttribute('href', this);
let origin = a.protocol + '//' + a.hostname;
if (a.port.length > 0) {
origin = `${origin}:${a.port}`;
}
const {host, hostname, pathname, port, protocol, search, hash} = a;
return {origin, host, hostname, pathname, port, protocol, search, hash};
}
Then :
'http://mysite:5050/pke45#23'.url()
//OUTPUT : {host: "mysite:5050", hostname: "mysite", pathname: "/pke45", port: "5050", protocol: "http:",hash:"#23",origin:"http://mysite:5050"}
For your request, you need :
'http://mysite:5050/pke45#23'.url().origin
const parseUrl = (string, prop) => {
  const a = document.createElement('a');
  a.setAttribute('href', string);
  const {host, hostname, pathname, port, protocol, search, hash} = a;
  const origin = `${protocol}//${hostname}${port.length ? `:${port}`:''}`;
  const parts = {origin, host, hostname, pathname, port, protocol, search, hash};
  // look the requested property up directly instead of evaluating it with eval
  return prop ? parts[prop] : parts;
}
Then
parseUrl('http://mysite:5050/pke45#23')
// {origin: "http://mysite:5050", host: "mysite:5050", hostname: "mysite", pathname: "/pke45", port: "5050"…}
parseUrl('http://mysite:5050/pke45#23', 'origin')
// "http://mysite:5050"
Cool!
Upvotes: 16
Reputation: 1817
This works for me:
var getBaseUrl = function (url) {
if (url) {
var parts = url.split('://');
if (parts.length > 1) {
return parts[0] + '://' + parts[1].split('/')[0] + '/';
} else {
return parts[0].split('/')[0] + '/';
}
}
};
Upvotes: 0
Reputation: 34498
I use a simple regex that extracts the host from the url:
function get_host(url){
return url.replace(/^((\w+:)?\/\/[^\/]+\/?).*$/,'$1');
}
and use it like this
var url = 'http://www.sitename.com/article/2009/09/14/this-is-an-article/'
var host = get_host(url);
Note, if the url does not end with a / the host will not end in a /.
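If a consistent trailing slash is wanted, one possible sketch is to normalize the result afterwards. get_host is copied from above; get_host_with_slash is a hypothetical name. Input that the regex does not match is returned unchanged by replace, so this sketch would just append a slash to it.

```javascript
// get_host as defined in the answer
function get_host(url){
  return url.replace(/^((\w+:)?\/\/[^\/]+\/?).*$/,'$1');
}

// Hypothetical wrapper: always end the result with exactly one "/"
function get_host_with_slash(url){
  var host = get_host(url);
  return host.endsWith('/') ? host : host + '/';
}

console.log(get_host_with_slash('http://www.sitename.com'));
console.log(get_host_with_slash('http://www.sitename.com/article/2009/09/14/this-is-an-article/'));
```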
Here are some tests:
describe('get_host', function(){
it('should return the host', function(){
var url = 'http://www.sitename.com/article/2009/09/14/this-is-an-article/';
assert.equal(get_host(url),'http://www.sitename.com/');
});
it('should not have a / if the url has no /', function(){
var url = 'http://www.sitename.com';
assert.equal(get_host(url),'http://www.sitename.com');
});
it('should deal with https', function(){
var url = 'https://www.sitename.com/article/2009/09/14/this-is-an-article/';
assert.equal(get_host(url),'https://www.sitename.com/');
});
it('should deal with no protocol urls', function(){
var url = '//www.sitename.com/article/2009/09/14/this-is-an-article/';
assert.equal(get_host(url),'//www.sitename.com/');
});
it('should deal with ports', function(){
var url = 'http://www.sitename.com:8080/article/2009/09/14/this-is-an-article/';
assert.equal(get_host(url),'http://www.sitename.com:8080/');
});
it('should deal with localhost', function(){
var url = 'http://localhost/article/2009/09/14/this-is-an-article/';
assert.equal(get_host(url),'http://localhost/');
});
it('should deal with numeric ip', function(){
var url = 'http://192.168.18.1/article/2009/09/14/this-is-an-article/';
assert.equal(get_host(url),'http://192.168.18.1/');
});
});
Upvotes: 7
Reputation: 3422
WebKit-based browsers, Firefox as of version 21 and current versions of Internet Explorer (IE 10 and 11) implement location.origin.
location.origin includes the protocol, the domain and optionally the port of the URL.
For example, location.origin of the URL http://www.sitename.com/article/2009/09/14/this-is-an-article/ is http://www.sitename.com.
To target browsers without support for location.origin use the following concise polyfill:
if (typeof location.origin === 'undefined')
location.origin = location.protocol + '//' + location.host;
Upvotes: 154
Reputation: 17057
If you are extracting information from window.location.href (the address bar), then use this code to get http://www.sitename.com/:
var loc = location;
var url = loc.protocol + "//" + loc.host + "/";
If you have a string, str, that is an arbitrary URL (not window.location.href), then use regular expressions:
var url = str.match(/^(([a-z]+:)?(\/\/)?[^\/]+\/).*$/)[1];
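I, like everyone in the Universe, hate reading regular expressions, so I'll break it down in English:
Find one or more lowercase letters followed by a colon (the protocol, which can be omitted)
Followed by // (which can also be omitted)
Followed by any characters except / (the hostname and port)
Followed by /
Followed by whatever (the path, less the beginning /).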
No need to create DOM elements or do anything crazy.
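One caveat with the pattern above: it needs a / after the host, so an input such as http://www.sitename.com backtracks and captures http:/ instead. The following is a sketch of a stricter variant, not the answer's regex; the function name baseUrlFromString is hypothetical. It requires an explicit scheme and //, stops the host at /, ? or #, and returns null when nothing matches.

```javascript
// Hypothetical stricter variant: scheme://host[:port], then a "/" appended.
function baseUrlFromString(str) {
  var match = /^([a-z][a-z0-9+.-]*:\/\/[^\/?#]+)/i.exec(str);
  return match ? match[1] + '/' : null;
}

console.log(baseUrlFromString('http://www.sitename.com/article/2009/09/14/this-is-an-article/'));
console.log(baseUrlFromString('http://www.sitename.com'));
console.log(baseUrlFromString('www.sitename.com/article'));
```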
Upvotes: 8
Reputation: 362
You can use the code below to get different parts of the current URL:
alert("document.URL : "+document.URL);
alert("document.location.href : "+document.location.href);
alert("document.location.origin : "+document.location.origin);
alert("document.location.hostname : "+document.location.hostname);
alert("document.location.host : "+document.location.host);
alert("document.location.pathname : "+document.location.pathname);
Upvotes: 7
Reputation: 1395
function getBaseURL() {
var url = location.href; // entire url including querystring - also: window.location.href;
var baseURL = url.substring(0, url.indexOf('/', 14));
if (baseURL.indexOf('http://localhost') != -1) {
// Base Url for localhost
var url = location.href; // window.location.href;
var pathname = location.pathname; // window.location.pathname;
var index1 = url.indexOf(pathname);
var index2 = url.indexOf("/", index1 + 1);
var baseLocalUrl = url.substr(0, index2);
return baseLocalUrl + "/";
}
else {
// Root Url for domain name
return baseURL + "/";
}
}
You then can use it like this...
var str = 'http://en.wikipedia.org/wiki/Knopf?q=1&t=2';
var url = str.toUrl();
The value of url will be...
{
"original":"http://en.wikipedia.org/wiki/Knopf?q=1&t=2",
"protocol":"http:",
"domain":"wikipedia.org",
"host":"en.wikipedia.org",
"relativePath":"wiki"
}
The "var url" also contains two methods.
var paramQ = url.getParameter('q');
In this case the value of paramQ will be 1.
var allParameters = url.getParameters();
The value of allParameters will be the parameter names only.
["q","t"]
Tested on IE, Chrome, and Firefox.
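The toUrl implementation itself is not shown here. As a rough point of comparison only, and not a reconstruction of it, the same two lookups can be sketched with the standard URL and URLSearchParams APIs:

```javascript
// Not the toUrl() code described above; a comparison using built-in APIs.
const str = 'http://en.wikipedia.org/wiki/Knopf?q=1&t=2';
const params = new URL(str).searchParams;

// value of a single parameter (a string, or null if absent)
const paramQ = params.get('q');

// parameter names only
const allParameters = [...params.keys()];

console.log(paramQ);
console.log(allParameters);
```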
Upvotes: 3
Reputation: 2364
A lightweight but complete approach to getting basic values from a string representation of a URL is Douglas Crockford's regexp rule:
var yourUrl = "http://www.sitename.com/article/2009/09/14/this-is-an-article/";
var parse_url = /^(?:([A-Za-z]+):)?(\/{0,3})([0-9.\-A-Za-z]+)(?::(\d+))?(?:\/([^?#]*))?(?:\?([^#]*))?(?:#(.*))?$/;
var parts = parse_url.exec( yourUrl );
var result = parts[1]+':'+parts[2]+parts[3]+'/' ;
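To make the capture groups easier to follow, here is a sketch that labels each one. The label names and the helpers parseWithNames and baseFromParts are chosen for illustration. Unlike the result line above, this base keeps the port when one is present, and it assumes the input has a scheme.

```javascript
var parse_url = /^(?:([A-Za-z]+):)?(\/{0,3})([0-9.\-A-Za-z]+)(?::(\d+))?(?:\/([^?#]*))?(?:\?([^#]*))?(?:#(.*))?$/;

// Illustrative labels for exec() indices 0..7
var names = ['url', 'scheme', 'slash', 'host', 'port', 'path', 'query', 'hash'];

function parseWithNames(url) {
  var parts = parse_url.exec(url);
  if (!parts) return null; // the pattern did not match
  var result = {};
  names.forEach(function (name, i) { result[name] = parts[i]; });
  return result;
}

// Assumes a scheme is present; groups that did not participate are undefined.
function baseFromParts(p) {
  return p.scheme + ':' + p.slash + p.host + (p.port ? ':' + p.port : '') + '/';
}

var parsed = parseWithNames('http://www.sitename.com:8080/article/2009/?x=1#top');
console.log(parsed);
console.log(baseFromParts(parsed));
```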
If you are looking for a more powerful URL manipulation toolkit, try URI.js. It supports getters, setters, URL normalization, etc., all with a nice chainable API.
If you are looking for a jQuery plugin, then jquery.url.js should help you.
A simpler way to do it is by using an anchor element, as @epascarello suggested. This has the disadvantage that you have to create a DOM Element. However this can be cached in a closure and reused for multiple urls:
var parseUrl = (function () {
var a = document.createElement('a');
return function (url) {
a.href = url;
return {
host: a.host,
hostname: a.hostname,
pathname: a.pathname,
port: a.port,
protocol: a.protocol,
search: a.search,
hash: a.hash
};
}
})();
Use it like so:
parseUrl('http://google.com');
Upvotes: 10
Reputation: 101
Instead of having to account for window.location.protocol and window.location.origin, and possibly missing a specified port number, etc., just grab everything up to the 3rd "/":
// get nth occurrence of a character c in the calling string
String.prototype.nthIndex = function (n, c) {
var index = -1;
while (n-- > 0) {
index++;
if (this.substring(index) == "") return -1; // don't run off the end
index += this.substring(index).indexOf(c);
}
return index;
}
// get the base URL of the current page by taking everything up to the third "/" in the URL
function getBaseURL() {
return document.URL.substring(0, document.URL.nthIndex(3,"/") + 1);
}
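A variant of the same idea that avoids extending String.prototype and works on any string is sketched below. The names nthIndexOf and baseUrlOf are hypothetical. It returns null when the string has fewer than three slashes, for example a URL with no path.

```javascript
// Hypothetical standalone helper: index of the nth occurrence of ch in str, or -1.
function nthIndexOf(str, ch, n) {
  var index = -1;
  for (var i = 0; i < n; i++) {
    index = str.indexOf(ch, index + 1);
    if (index === -1) return -1; // fewer than n occurrences
  }
  return index;
}

// Everything up to and including the third "/", or null if there is no third "/".
function baseUrlOf(str) {
  var end = nthIndexOf(str, '/', 3);
  return end === -1 ? null : str.substring(0, end + 1);
}

console.log(baseUrlOf('http://www.sitename.com/article/2009/09/14/this-is-an-article/'));
console.log(baseUrlOf('http://www.sitename.com'));
```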
Upvotes: 3
Reputation: 501
If you're using jQuery, this is a kinda cool way to manipulate elements in javascript without adding them to the DOM:
var myAnchor = $("<a />");
//set href
myAnchor.attr('href', 'http://example.com/path/to/myfile')
//your link's features
var hostname = myAnchor.prop('hostname'); // example.com
var pathname = myAnchor.prop('pathname'); // /path/to/myfile
//...etc
Upvotes: 12
Reputation: 10536
You can do it using a regex:
/(http:\/\/)?(www)[^\/]+\//i
Does that fit?
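To see what this pattern covers, here is a small sketch that runs it against a few inputs; the helper name firstMatch is made up. Because the pattern is unanchored and hard-codes http:// and www, an https URL loses its scheme and a host without www does not match at all.

```javascript
var pattern = /(http:\/\/)?(www)[^\/]+\//i;

// Hypothetical helper: the matched text, or null when the pattern does not match.
function firstMatch(url) {
  var m = pattern.exec(url);
  return m ? m[0] : null;
}

console.log(firstMatch('http://www.sitename.com/article/2009/09/14/this-is-an-article/'));
console.log(firstMatch('https://www.sitename.com/article/')); // scheme is not part of the match
console.log(firstMatch('http://sitename.com/article/'));      // no "www", so no match
```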
Upvotes: 1