Reputation: 317
I am having trouble writing a regular expression that extracts the main domain name from a URL. For example, given the URLs below:
http://domain.com/return/java.php?hello.asp
http://www.domain.com/return/java.php?hello.asp
http://blog.domain.net/return/java.php?hello.asp
http://us.blog.domain.co.us/return/java.php?hello.asp
http://domain.co.uk
http://domain.net
http://www.blog.domain.co.ca/return/java.php?hello.asp
http://us.domain.com/return/
From all of these I should get only domain as the output of the regular expression. How do I do that? I tried:
var url = urls.match(/[^.]*.(com|net|org|info|coop|int|co\.uk|org\.uk|ac\.uk|uk)/g);
but it does not work for
http://domain.net
Can someone help me out with this?
Upvotes: 1
Views: 1405
Reputation: 14925
Here is a solution changing the regex a bit:
url.match(/https?:\/\/[^/]+((?=\/)|$)/g);
//tested with Chrome 38+ on Win7
Basically, it checks for a slash / or the end of the string $ after the host.
Update: replaced the jsFiddle link with inline Stack Overflow code:
var urls = ['http://domain.com/return/java.php?hello.asp',
'http://www.domain.com/return/java.php?hello.asp',
'http://blog.domain.net/return/java.php?hello.asp',
'http://us.blog.domain.co.us/return/java.php?hello.asp',
'http://domain.co.uk',
'http://domain.net',
'http://www.blog.domain.co.ca/return/java.php?hello.asp',
'http://us.domain.com/return/'
];
var htmlConsole = document.getElementById("result");
var htmlTab = " ";
var htmlNewLine = "<br />";
htmlConsole.innerHTML = "";
urls.forEach(function (url) {
  htmlConsole.innerHTML += "URL: " + url + htmlNewLine;
  // match() with the g flag returns null when nothing matches
  var matchResults = url.match(/https?:\/\/[^/]+((?=\/)|$)/g) || [];
  matchResults.forEach(function (matchValue, matchNumber) {
    htmlConsole.innerHTML += htmlTab + "MatchNumber: " + matchNumber + " MatchValue: " + matchValue + htmlNewLine;
  });
  htmlConsole.innerHTML += htmlNewLine;
});
<div id="result">
</div>
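If only the host is wanted, without the scheme, a capture group around `[^/]+` is one option. The following is a minimal sketch that runs outside the browser; `hostOf` is a hypothetical helper name, and the captured text would also include a port or user info if the URL had one.

```javascript
// Hypothetical helper: returns the text between "://" and the first "/",
// or null when the string does not start with http:// or https://.
function hostOf(url) {
  var match = /^https?:\/\/([^/]+)/.exec(url);
  return match ? match[1] : null;
}

console.log(hostOf("http://www.blog.domain.co.ca/return/java.php?hello.asp")); // www.blog.domain.co.ca
console.log(hostOf("http://domain.net")); // domain.net
```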
Upvotes: 0
Reputation: 1767
Would this help?
/(http|https|ftp):\/\/([a-zA-Z0-9.])+/g
It matches:
http://domain.com
http://www.domain.com
http://blog.domain.net
http://us.blog.domain.co.us
http://domain.co.uk
http://domain.net
http://www.blog.domain.co.ca
http://us.domain.com
Upvotes: 0
Reputation: 4574
var url = urls.match(/[^./]*\.(com|net|org|info|coop|int|co\.uk|co\.us|co\.ca|org\.uk|ac\.uk|uk)/g);
I just added a / to the character class, escaped the dot before the suffix so that it matches only a literal dot, and updated the list of top-level domains to match your examples.
However, I do not recommend keeping the list of top-level domains inside a regexp; there are simply too many of them: http://en.wikipedia.org/wiki/List_of_Internet_top-level_domains
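One way to keep the suffixes out of the regexp is to put them in a plain lookup table and compare the hostname's labels against it. The sketch below is only an illustration: `MULTI_LABEL_SUFFIXES` and `mainDomainLabel` are made-up names, and the tiny set covers just the two-label endings from the examples in this thread. A real application would more likely load a maintained list such as the Public Suffix List.

```javascript
// Hypothetical, hand-picked set of two-label suffixes taken from the
// examples in this thread; it is NOT a complete list.
const MULTI_LABEL_SUFFIXES = new Set(["co.uk", "co.us", "co.ca", "org.uk", "ac.uk"]);

// Returns the label directly in front of the suffix (e.g. "domain"),
// or null when there is no such label.
// Assumes `urlString` is an absolute URL with a DNS name, not an IP address.
function mainDomainLabel(urlString) {
  const labels = new URL(urlString).hostname.split(".");
  const lastTwo = labels.slice(-2).join(".");
  // Skip two labels for a known two-label suffix, otherwise one.
  const suffixLength = MULTI_LABEL_SUFFIXES.has(lastTwo) ? 2 : 1;
  const index = labels.length - suffixLength - 1;
  return index >= 0 ? labels[index] : null;
}

console.log(mainDomainLabel("http://us.blog.domain.co.us/return/java.php?hello.asp")); // domain
console.log(mainDomainLabel("http://domain.net")); // domain
```

Using a set lookup keeps the suffix data separate from the matching logic, so the list can grow without the expression becoming harder to read.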
Upvotes: -1
Reputation: 26687
You can use URL rather than a regex:
var url = new URL("http://domain.com/return/java.php?hello.asp");
console.log(url.hostname);
=> domain.com
Or, if you want the protocol as well:
var url = new URL("http://domain.com/return/java.php?hello.asp");
console.log(url.protocol+"//"+url.hostname);
=> http://domain.com
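Two details may be worth knowing when using this approach: the `URL` constructor throws a `TypeError` for strings it cannot parse, and the `origin` property already combines the protocol and host (it also keeps a non-default port, unlike the manual concatenation with `hostname`). A minimal sketch, where `originOf` is a hypothetical helper name:

```javascript
// Hypothetical helper: returns scheme + host for an absolute URL,
// or null when the input cannot be parsed as one.
function originOf(input) {
  try {
    return new URL(input).origin;
  } catch (err) {
    return null; // new URL throws a TypeError on unparsable input
  }
}

console.log(originOf("http://us.domain.com/return/")); // http://us.domain.com
console.log(originOf("not a url")); // null
```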
Upvotes: 4