Reputation: 2656
I'm trying to get the base url from a string (So no window.location).
In other words all the following should return https://apple.com
or https://www.apple.com
for the last one.
https://apple.com?query=true&slash=false
https://apple.com#anchor=true&slash=false
http://www.apple.com/#anchor=true&slash=true&whatever=foo
These are just examples, urls can have different subdomains like https://shop.apple.co.uk/?query=foo
should return https://shop.apple.co.uk
- It could be any url like: https://foo.bar
The closer I got is with:
const baseUrl = url.replace(/^((\w+:)?\/\/[^\/]+\/?).*$/,'$1').replace(/\/$/, ""); // Base Path & Trailing slash
But this doesn't work with anchor links and queries which start right after the url without the /
before
Any idea how I can get it to work on all cases?
Upvotes: 2
Views: 5007
Reputation: 5303
You could use Web API's built-in URL for this. URL will also provide you with other parsed properties that are easy to get to, like the query string params, the protocol, etc.
Regex is a painful way to do something that the browser makes otherwise very simple.
I know that you asked about using regex, but in the event that you (or someone coming here in the future) really just cares about getting the information out and isn't committed to using regex, maybe this answer will help.
let one = "https://apple.com?query=true&slash=false"
let two = "https://apple.com#anchor=true&slash=false"
let three = "http://www.apple.com/#anchor=true&slash=true&whatever=foo"
let urlOne = new URL(one)
console.log(urlOne.origin)
let urlTwo = new URL(two)
console.log(urlTwo.origin)
let urlThree = new URL(three)
console.log(urlThree.origin)
Upvotes: 4
Reputation: 2517
This will get you everything up to the .com part. You will have to append .com once you pull out the first part of the url.
^http.*?(?=\.com)
Or maybe you could do:
myUrl.Replace(/(#|\?|\/#).*$/, "")
To remove everything after the host name.
Upvotes: 0
Reputation: 163207
You could add #
and ?
to your negated character class. You don't need .*
because that will match until the end of the string.
For your example data, you could match:
^https?:\/\/[^#?\/]+
strings = [
"https://apple.com?query=true&slash=false",
"https://apple.com#anchor=true&slash=false",
"http://www.apple.com/#anchor=true&slash=true&whatever=foo",
"https://foo.bar/?q=true"
];
strings.forEach(s => {
console.log(s.match(/^https?:\/\/[^#?\/]+/)[0]);
})
Upvotes: 4