Reputation: 527
I'm using HttpURLConnection to validate URLs coming out of a database. With certain URLs I sometimes get an exception. I assume those requests are timing out, yet the URLs are in fact reachable (no 400-range error).
Increasing the timeout doesn't seem to matter; I still get an exception. Is there a second check I could do in the catch block to verify whether the URL is in fact bad? The relevant code is below. It works with 99.9% of URLs; it's the remaining 0.1% that fail.
    try {
        HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
        connection.setConnectTimeout(timeout);
        connection.setReadTimeout(timeout);
        connection.setRequestMethod("GET");
        connection.setRequestProperty("User-Agent",
                "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.0.13) Gecko/2009073021 Firefox/3.0.13");
        connection.connect();
        int responseCode = connection.getResponseCode();
        // Any 4xx or 5xx status counts as a failure.
        if (responseCode >= 400)
        {
            String prcMessage = "ERROR: URL " + url + " not found, response code was " + responseCode + "\r";
            System.out.println(prcMessage);
            VerifyUrl.writeToFile(prcMessage);
            return false;
        }
    }
    catch (IOException exception)
    {
        // Reached for any IOException (timeout, DNS failure, TLS error, ...), not only timeouts.
        String errorMessage = "ERROR: URL " + url + " did not load in the given time of " + timeout + " milliseconds.";
        System.out.println(errorMessage);
        VerifyUrl.writeToFile(errorMessage);
        return false;
    }
Upvotes: 1
Views: 3298
Reputation: 345
It depends on what you want to check, but I guess "Validating URL in Java" has you covered.
You have two possibilities:
Check syntax ("Is this URL a real URL or just made up?")
Plenty has been written about how to do this. Basically, search for RFC 3986, which defines URI syntax. I guess someone has already implemented a check like this.
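As a rough illustration of a syntax-only check, here is a minimal sketch built on java.net.URI. Note that java.net.URI follows the older RFC 2396 grammar (as amended by RFC 2732), so it only approximates RFC 3986. The class name UrlSyntaxCheck, the helper isSyntacticallyValid, and the restriction to http/https are my own assumptions, not something from the question.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UrlSyntaxCheck {
    // Hypothetical helper: true if the string parses as an absolute
    // http or https URI that has a host component.
    static boolean isSyntacticallyValid(String candidate) {
        if (candidate == null) {
            return false;
        }
        try {
            URI uri = new URI(candidate);
            String scheme = uri.getScheme();
            boolean httpLike = "http".equalsIgnoreCase(scheme)
                    || "https".equalsIgnoreCase(scheme);
            return httpLike && uri.getHost() != null;
        } catch (URISyntaxException e) {
            // Illegal characters (for example an unescaped space) end up here.
            return false;
        }
    }

    public static void main(String[] args) {
        for (String candidate : args) {
            System.out.println(candidate + " -> " + isSyntacticallyValid(candidate));
        }
    }
}
```

This only tells you whether the string is shaped like a URL. It says nothing about whether the server exists or answers.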
Check the semantics ("Is the URL available?")
There is not really a faster way to do that, though there are different tools available for sending an HTTP request in Java. You could send a HEAD request instead of GET: HEAD omits the response body, so requests may be faster and time out less often.
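A minimal sketch of the HEAD idea, using the same HttpURLConnection as in the question. It also separates timeouts from other I/O failures, since a catch on IOException alone cannot tell a slow server from a DNS or TLS problem. The class UrlReachabilityCheck, the Result enum, and the helper names are hypothetical, and the sketch assumes the input is an http or https URL.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.SocketTimeoutException;
import java.net.URL;
import java.net.UnknownHostException;

class UrlReachabilityCheck {
    // Hypothetical result categories, introduced for illustration only.
    enum Result { OK, HTTP_ERROR, TIMEOUT, UNKNOWN_HOST, IO_ERROR }

    // 2xx and 3xx count as reachable; everything else is reported as an HTTP error.
    static Result classifyStatus(int responseCode) {
        return (responseCode >= 200 && responseCode < 400) ? Result.OK : Result.HTTP_ERROR;
    }

    // Keeps a timeout distinct from a DNS failure or any other I/O problem.
    static Result classifyFailure(IOException e) {
        if (e instanceof SocketTimeoutException) return Result.TIMEOUT;
        if (e instanceof UnknownHostException) return Result.UNKNOWN_HOST;
        return Result.IO_ERROR;
    }

    // Sends a HEAD request and classifies the outcome. Assumes an http(s) URL.
    static Result check(String url, int timeoutMillis) {
        HttpURLConnection connection = null;
        try {
            connection = (HttpURLConnection) new URL(url).openConnection();
            connection.setConnectTimeout(timeoutMillis);
            connection.setReadTimeout(timeoutMillis);
            connection.setRequestMethod("HEAD"); // no response body is transferred
            return classifyStatus(connection.getResponseCode());
        } catch (IOException e) {
            // A malformed URL also lands here and is reported as IO_ERROR.
            return classifyFailure(e);
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: java UrlReachabilityCheck <url>...");
            return;
        }
        for (String url : args) {
            System.out.println(url + " -> " + check(url, 5000));
        }
    }
}
```

Keep in mind that some servers answer HEAD with 405 Method Not Allowed even though GET works, so retrying with GET before marking a URL as bad is one option. Logging the exception class in the catch block should also show whether the failing URLs are really timing out.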
Upvotes: 2