Reputation: 508
I am trying to write a Java program that loads the pages pointed to by valid links and reports the other links as broken. My problem is that Java's URL class downloads the appropriate page when the URL is valid, but downloads the search-engine results for the URL when it is invalid.
Is there a Java function that detects whether a URL resolves to a legitimate page? Thanks very much,
Joel
Upvotes: 1
Views: 109
Reputation: 156662
You can get the HTTP response code for a URL like so:
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

public static int getResponseCode(URL url) throws IOException {
    URLConnection conn = url.openConnection();
    if (!(conn instanceof HttpURLConnection)) {
        throw new IllegalArgumentException("not an HTTP url: " + url);
    }
    HttpURLConnection httpConn = (HttpURLConnection) conn;
    // getResponseCode() sends the request and parses the status
    // line of the response (e.g. 200, 404, 500).
    return httpConn.getResponseCode();
}
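For example, a quick sketch of how you might call it (the URL here is just a placeholder):

    URL url = new URL("http://example.com/some/page");
    System.out.println(url + " returned HTTP " + getResponseCode(url));

Note that new URL(String) throws MalformedURLException (a subclass of IOException) if the string is not a well-formed URL, so the caller needs to handle that case as well.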
Now the question is, what do you consider a "valid" webpage? For me, a URL is valid if it parses correctly, its protocol is "http" (or "https"), and its response code is in the 200 block, or is 302 (Found/Redirect) or 304 (Not Modified):
public static boolean isValidHttpResponseCode(int code) {
    // Accept any 2xx success code, plus 302 (Found) and 304 (Not Modified).
    return ((code / 100) == 2) || (code == 302) || (code == 304);
}
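Putting the two together, here is a minimal sketch of a link check in the spirit of the question (isLinkAlive is a hypothetical helper name; malformed links, unreachable hosts, and non-HTTP URLs are all simply reported as broken):

    import java.io.IOException;
    import java.net.MalformedURLException;
    import java.net.URL;

    public static boolean isLinkAlive(String link) {
        try {
            // Parse the link, fetch its status code, and test it.
            return isValidHttpResponseCode(getResponseCode(new URL(link)));
        } catch (MalformedURLException e) {
            return false; // not a syntactically valid URL
        } catch (IOException e) {
            return false; // DNS failure, connection refused, timeout, etc.
        } catch (IllegalArgumentException e) {
            return false; // not an HTTP(S) URL
        }
    }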
Upvotes: 1