Literati Insolitus

Reputation: 508

Java function to detect valid webpage

I am trying to write a Java program that loads the pages behind valid links and reports all other links as broken. My problem is that Java's URL class downloads the expected page when the URL is valid, but downloads a page of search-engine results for the URL when it is invalid.

Is there a Java function that detects whether a URL resolves to a legitimate page? Thanks very much,

Joel

Upvotes: 1

Views: 109

Answers (2)

maerics

Reputation: 156662

You can get the HTTP response code for a URL like so:

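import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

// Opens a connection to the URL and returns the HTTP status code of the response.
public static int getResponseCode(URL url) throws IOException {
  URLConnection conn = url.openConnection();
  if (!(conn instanceof HttpURLConnection)) {
    throw new IllegalArgumentException("not an HTTP url: " + url);
  }
  HttpURLConnection httpConn = (HttpURLConnection) conn;
  return httpConn.getResponseCode();
}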

Now the question is, what do you consider a "valid" webpage? For me, a URL is valid if it parses correctly, its protocol is "http" (or "https"), and its response code is in the 200 block, or is 302 (Found/Redirect) or 304 (Not Modified):

public boolean isValidHttpResponseCode(int code) {
    return ((code / 100) == 2) || (code == 302) || (code == 304);
}
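
Putting the two pieces together, a link checker could look roughly like the sketch below. The class name LinkChecker, the method isLinkValid, the HEAD request, the 5-second timeouts, and the choice to treat every I/O failure as "broken" are my assumptions, not something the question specified. Note that this only inspects the status code: if your network answers unknown hosts with a search page that returns 200, this check alone will not notice.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

public class LinkChecker {

    // Same rule as above: 2xx, 302, or 304 counts as valid.
    public static boolean isValidHttpResponseCode(int code) {
        return ((code / 100) == 2) || (code == 302) || (code == 304);
    }

    // Returns true if the link answers with a status code accepted above.
    // Assumption: any I/O failure (malformed URL, unknown host, timeout,
    // refused connection) is reported as a broken link.
    public static boolean isLinkValid(String link) {
        try {
            URLConnection conn = new URL(link).openConnection();
            if (!(conn instanceof HttpURLConnection)) {
                return false; // not an http or https URL
            }
            HttpURLConnection httpConn = (HttpURLConnection) conn;
            httpConn.setRequestMethod("HEAD");          // assumption: the server supports HEAD
            httpConn.setInstanceFollowRedirects(false); // look at the first response, not the redirect target
            httpConn.setConnectTimeout(5000);           // milliseconds, arbitrary choice
            httpConn.setReadTimeout(5000);
            try {
                return isValidHttpResponseCode(httpConn.getResponseCode());
            } finally {
                httpConn.disconnect();
            }
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Each command-line argument is treated as a link to check.
        for (String link : args) {
            System.out.println(link + " -> " + (isLinkValid(link) ? "ok" : "broken"));
        }
    }
}
```

Some servers reject HEAD requests; if that matters for your links, switching the request method back to the default GET is a reasonable fallback.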

Upvotes: 1

Val

Reputation: 173

HttpURLConnection#getResponseCode will give you the HTTP status code of the response.

Upvotes: 2
