Reputation: 101
I want to show only the website name.
I don't want to show ".com", "us.cnn.com" or "www.bbc.co.uk", just the name of the website, like "cnn" or "bbc".
My code:
private String getHostName(String urlInput) {
    urlInput = urlInput.toLowerCase();
    String hostName = urlInput;
    if (!urlInput.equals("")) {
        if (urlInput.startsWith("http") || urlInput.startsWith("https")) {
            try {
                URL netUrl = new URL(urlInput);
                String host = netUrl.getHost();
                if (host.startsWith("www")) {
                    hostName = host.substring("www".length() + 1);
                } else {
                    hostName = host;
                }
            } catch (MalformedURLException e) {
                hostName = urlInput;
            }
        } else if (urlInput.startsWith("www")) {
            hostName = urlInput.substring("www".length() + 1);
        }
        return hostName;
    } else {
        return "";
    }
}
Inputs
http://www.bbc.co.uk/news/world-us-canada-39018776
http://us.cnn.com/2017/02/18/politics/john-mccain-donald-trump-dictators/index.html
http://bigstory.ap.org/article/d5dd5962fc4d42b195117ca63e0ba9af/revived-rally-trump-turns-back-governing
Outputs
www.bbc.co.uk
us.cnn.com
bigstory.ap.org
I just want to extract the names "bbc", "cnn" and "ap" from them.
Upvotes: 1
Views: 1170
Reputation: 23404
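String mainUrl;
urlInput = urlInput.toLowerCase();
String hostName = urlInput;
// Split on dots: "http://www.bbc.co.uk/..." becomes ["http://www", "bbc", "co", ...]
String[] suburls = hostName.split("\\.");
mainUrl = suburls[0];
// If the first part contains "www", the name is the next part
if (suburls[0].contains("www") && suburls.length > 1) {
    mainUrl = suburls[1];
}
// String.replace returns a new String, so the result has to be assigned back
if (mainUrl.contains("http://")) {
    mainUrl = mainUrl.replace("http://", "");
} else if (mainUrl.contains("https://")) {
    mainUrl = mainUrl.replace("https://", "");
}
Now the result should be in mainUrl.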
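The dot-splitting idea can also be applied to the parsed host rather than to the raw URL string. The following is only a sketch under stated assumptions: the class name SiteName, the method siteName and the small TWO_LABEL_SUFFIXES set are made up for illustration, and the set is a stand-in for a real suffix list (for example the Public Suffix List), so hosts under suffixes that are not listed will give the wrong label.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class SiteName {
    // Hypothetical, deliberately tiny list of two-label suffixes (assumption for this sketch).
    private static final Set<String> TWO_LABEL_SUFFIXES =
            new HashSet<>(Arrays.asList("co.uk", "org.uk", "com.au", "co.jp"));

    // Returns the label directly in front of the suffix, e.g. "bbc" for www.bbc.co.uk.
    public static String siteName(String url) throws URISyntaxException {
        String host = new URI(url).getHost();
        if (host == null) {
            return ""; // no host could be parsed, e.g. the scheme is missing
        }
        String[] labels = host.toLowerCase(Locale.ROOT).split("\\.");
        int suffixLabels = 1; // assume a one-label suffix such as "com" or "org"
        if (labels.length >= 3) {
            String lastTwo = labels[labels.length - 2] + "." + labels[labels.length - 1];
            if (TWO_LABEL_SUFFIXES.contains(lastTwo)) {
                suffixLabels = 2;
            }
        }
        int index = labels.length - suffixLabels - 1;
        // Single-label hosts such as "localhost" have nothing in front of the suffix.
        return index >= 0 ? labels[index] : host;
    }

    public static void main(String[] args) throws URISyntaxException {
        System.out.println(siteName("http://www.bbc.co.uk/news/world-us-canada-39018776")); // bbc
        System.out.println(siteName("http://us.cnn.com/2017/02/18/politics/john-mccain-donald-trump-dictators/index.html")); // cnn
        System.out.println(siteName("http://bigstory.ap.org/article/d5dd5962fc4d42b195117ca63e0ba9af/revived-rally-trump-turns-back-governing")); // ap
    }
}
```

Counting labels from the right means subdomains such as "us" or "bigstory" are ignored without having to list them.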
Upvotes: 0
Reputation: 532
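First, convert your website URL to a URI:
import java.net.URI;
import java.net.URISyntaxException;

public static String getDomainName(String url) throws URISyntaxException {
    URI uri = new URI(url);
    String domain = uri.getHost();
    // Drop a leading "www." if present; the rest of the host is returned unchanged
    return domain.startsWith("www.") ? domain.substring(4) : domain;
}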
Upvotes: -1
Reputation: 3417
You can use the java.net.URI class to extract the hostname from the string.
Example code:
import java.net.URI;
import java.net.URISyntaxException;

public String getHostName(String url) throws URISyntaxException {
    URI uri = new URI(url);
    String hostname = uri.getHost();
    // getHost() can return null, so only strip a leading "www." when a hostname is present
    if (hostname != null) {
        return hostname.startsWith("www.") ? hostname.substring(4) : hostname;
    }
    return hostname;
}
The code above gives you the hostname and copes with hosts both with and without a leading "www.": for a URL whose host is google.com or www.google.com, it returns "google.com".
If the given url has no hostname that can be parsed, it returns null.
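One common way to hit that null case is a URL without a scheme, because java.net.URI then treats the whole text as a path. A minimal sketch of a fallback, assuming it is acceptable to retry with an "http://" prefix and to return null for text that cannot be parsed at all (the names HostNameFallback and hostOrNull are made up for illustration):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class HostNameFallback {
    // Returns the host without a leading "www.", or null if no host can be parsed.
    public static String hostOrNull(String url) {
        try {
            String host = new URI(url).getHost();
            if (host == null && !url.contains("://")) {
                // No scheme given: retry with an assumed "http://" prefix.
                host = new URI("http://" + url).getHost();
            }
            if (host == null) {
                return null;
            }
            return host.startsWith("www.") ? host.substring(4) : host;
        } catch (URISyntaxException e) {
            return null; // e.g. the text contains spaces
        }
    }

    public static void main(String[] args) {
        System.out.println(hostOrNull("http://www.bbc.co.uk/news/world-us-canada-39018776")); // bbc.co.uk
        System.out.println(hostOrNull("www.bbc.co.uk/news")); // bbc.co.uk
        System.out.println(hostOrNull("not a url")); // null
    }
}
```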
Upvotes: 3