Reputation: 55
I am writing a program in Java that tests the validity of many websites. My plan is to take each URL, prepend http:// to it, and use the HttpURLConnection class.
My problem is that I sometimes get a 403 Forbidden response code. Is there any way around this? If I get a 403 Forbidden, does that mean the website is indeed valid? I pasted a URL that returned 403 into the browser and was able to connect just fine.
Another problem is that I often get 301, 302, and 303, which I know are related to redirects. I then read the redirect URL from the "Location" response header. When connecting to these URLs I get an error related to certificate chaining. I believe this can be solved by using a KeyStore that contains the certificates or certificate issuers that we deem valid. Does that sound right?
Thanks.
I don't have my code on this PC but I will try to recreate it.
boolean pingSuccess = false;
HttpURLConnection connection = (HttpURLConnection) new URL(urlString).openConnection();
int response = connection.getResponseCode();
if (response == 301 || response == 302 || response == 303) {
    // The redirect target is in the "Location" response header
    String newUrl = connection.getHeaderField("Location");
    connection = (HttpURLConnection) new URL(newUrl).openConnection();
    response = connection.getResponseCode();
    if (response == 200)
        pingSuccess = true;
}
return pingSuccess;
Upvotes: 2
Views: 149
Reputation: 55
I was googling around and found the following. After setting this header on the connection, I get a 200 response (good) for a website that was previously returning a 403, even though accessing that website in a browser had always worked.
conn.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.109 Safari/537.36");
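For anyone who wants it in context, here is a minimal sketch of how that header could be wired into a status check. The class and method names (BrowserLikeRequest, prepare, fetchStatus) and the 5-second timeouts are my own assumptions, not part of any library; the User-Agent string is simply the one quoted above, and some servers may still refuse the request for other reasons.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper, for illustration only.
final class BrowserLikeRequest {

    // Assumption: any common browser User-Agent would do; this is the string from the post.
    static final String USER_AGENT =
            "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 "
            + "(KHTML, like Gecko) Chrome/48.0.2564.109 Safari/537.36";

    // Builds the connection object. openConnection() does not touch the network yet.
    static HttpURLConnection prepare(String urlString) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(urlString).openConnection();
        conn.setRequestProperty("User-Agent", USER_AGENT);
        conn.setConnectTimeout(5000); // assumed timeout, in milliseconds
        conn.setReadTimeout(5000);    // assumed timeout, in milliseconds
        return conn;
    }

    // Sends the request and returns the HTTP status code.
    static int fetchStatus(String urlString) throws IOException {
        HttpURLConnection conn = prepare(urlString);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

Calling BrowserLikeRequest.fetchStatus(urlString) then replaces the bare getResponseCode() call in the loop over the sites.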
Upvotes: 1
Reputation: 2208
403 - Forbidden Access
This indicates a fundamental access problem, which may be difficult to resolve because the HTTP protocol allows the Web server to give this response without providing any reason at all. So the 403 error is equivalent to a blanket 'NO' by the Web server - with no further discussion allowed.
A common reason for this error is that directory browsing is forbidden for the Web site. Most Web sites want you to navigate using the URLs in the Web pages for that site. They do not often allow you to browse the file directory structure of the site. For example, try the following URL:
http://www.browsesites.com/accounts/B1394343/
This URL should fail with a 403 error saying "Forbidden: You do not have permission to access /accounts/B1394343/ on this server". This is because the browsesites Web site deliberately does not want you to browse directories - you have to navigate from one specific Web page to another using the hyperlinks in those Web pages. This is true for most Web sites on the Internet - their Web server has "Allow directory browsing" set OFF.
You first need to confirm if you have encountered a "No directory browsing" problem. You can see this if the URL ends in a slash '/' rather than the name of a specific Web page (e.g. .htm or .html). If this is your problem, then you have no option but to access individual Web pages for that Web site directly.
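For a validity checker, the practical point is that a 403 still proves a server answered at that host. Below is a rough sketch of one way to sort status codes into verdicts. The StatusCheck class, the Verdict names, and the grouping policy are my own assumptions; you may well want different buckets.

```java
// Hypothetical classifier, for illustration only.
final class StatusCheck {

    enum Verdict { REACHABLE, REDIRECT, ALIVE_BUT_REFUSED, NOT_FOUND, SERVER_ERROR, OTHER }

    // Maps an HTTP status code to a coarse verdict about the site.
    static Verdict classify(int code) {
        if (code >= 200 && code < 300) return Verdict.REACHABLE;
        if (code >= 300 && code < 400) return Verdict.REDIRECT;
        // 401/403: the server exists and responded, it just refused this request.
        if (code == 401 || code == 403) return Verdict.ALIVE_BUT_REFUSED;
        if (code == 404 || code == 410) return Verdict.NOT_FOUND;
        if (code >= 500 && code < 600) return Verdict.SERVER_ERROR;
        return Verdict.OTHER;
    }

    // Assumed policy: the site counts as "valid" if a server answered and the page is not gone.
    static boolean siteExists(int code) {
        Verdict v = classify(code);
        return v == Verdict.REACHABLE || v == Verdict.REDIRECT || v == Verdict.ALIVE_BUT_REFUSED;
    }
}
```

With this policy, StatusCheck.siteExists(403) is true while StatusCheck.siteExists(404) is false.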
I tried your code with a few minor edits. Here it is:
package general;

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.UnknownHostException;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

public class HTTPLinkTest {

    public static boolean testLink(String urlLink) {
        boolean pingSuccess = false;
        try {
            System.out.println("Test validity of URL:" + urlLink);
            URL myUrl = new URL(urlLink);
            HttpURLConnection connection;
            connection = (HttpURLConnection) myUrl.openConnection();
            int response = connection.getResponseCode();
            if (response == 301 || response == 302 || response == 303) {
                // Follow one redirect using the "Location" response header
                String newUrl = connection.getHeaderField("Location");
                System.out.println("Got redirected to new URL:" + newUrl);
                connection = (HttpURLConnection) new URL(newUrl).openConnection();
                response = connection.getResponseCode();
            }
            // Request has succeeded, either directly or after the redirect
            if (response == 200)
                pingSuccess = true;
        } catch (MalformedURLException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return pingSuccess;
    }

    public static boolean testSSLConnection(String sslLink) {
        SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
        String host = sslLink;
        int port = 443;
        System.out.println("Creating secure socket to " + host + ":" + port);
        // try-with-resources closes the socket when we are done
        try (SSLSocket socket = (SSLSocket) factory.createSocket(host, port)) {
            // These are the suites the local JVM supports, not the ones negotiated with the server
            String[] suites = socket.getSupportedCipherSuites();
            System.out.println("Supported suites are:");
            for (String suite : suites) {
                System.out.println(suite);
            }
            return true;
        } catch (UnknownHostException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        // The socket could not be created
        return false;
    }

    public static void main(String[] args) {
        String[] sslLinks = {"netbanking.hdfcbank.com"};
        for (String sslLink : sslLinks) {
            testSSLConnection(sslLink);
        }
        String[] links = {"http://www.yahoo.com", "http://www.yahoo.com/book"};
        for (String link : links) {
            System.out.println("Test Result: " + link + (testLink(link) ? " is Valid URL" : " is Invalid URL"));
            System.out.println();
        }
    }
}
Output:
Creating secure socket to netbanking.hdfcbank.com:443
Supported suites are:
TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256
TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
TLS_RSA_WITH_AES_128_CBC_SHA256
TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA256
TLS_ECDH_RSA_WITH_AES_128_CBC_SHA256
TLS_DHE_RSA_WITH_AES_128_CBC_SHA256
TLS_DHE_DSS_WITH_AES_128_CBC_SHA256
TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA
TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA
TLS_RSA_WITH_AES_128_CBC_SHA
TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA
TLS_ECDH_RSA_WITH_AES_128_CBC_SHA
TLS_DHE_RSA_WITH_AES_128_CBC_SHA
TLS_DHE_DSS_WITH_AES_128_CBC_SHA
TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
TLS_RSA_WITH_AES_128_GCM_SHA256
TLS_ECDH_ECDSA_WITH_AES_128_GCM_SHA256
TLS_ECDH_RSA_WITH_AES_128_GCM_SHA256
TLS_DHE_RSA_WITH_AES_128_GCM_SHA256
TLS_DHE_DSS_WITH_AES_128_GCM_SHA256
TLS_ECDHE_ECDSA_WITH_3DES_EDE_CBC_SHA
TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA
SSL_RSA_WITH_3DES_EDE_CBC_SHA
TLS_ECDH_ECDSA_WITH_3DES_EDE_CBC_SHA
TLS_ECDH_RSA_WITH_3DES_EDE_CBC_SHA
SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA
TLS_EMPTY_RENEGOTIATION_INFO_SCSV
TLS_DH_anon_WITH_AES_128_GCM_SHA256
TLS_DH_anon_WITH_AES_128_CBC_SHA256
TLS_ECDH_anon_WITH_AES_128_CBC_SHA
TLS_DH_anon_WITH_AES_128_CBC_SHA
TLS_ECDH_anon_WITH_3DES_EDE_CBC_SHA
SSL_DH_anon_WITH_3DES_EDE_CBC_SHA
SSL_RSA_WITH_DES_CBC_SHA
SSL_DHE_RSA_WITH_DES_CBC_SHA
SSL_DHE_DSS_WITH_DES_CBC_SHA
SSL_DH_anon_WITH_DES_CBC_SHA
SSL_RSA_EXPORT_WITH_DES40_CBC_SHA
SSL_DHE_RSA_EXPORT_WITH_DES40_CBC_SHA
SSL_DHE_DSS_EXPORT_WITH_DES40_CBC_SHA
SSL_DH_anon_EXPORT_WITH_DES40_CBC_SHA
TLS_RSA_WITH_NULL_SHA256
TLS_ECDHE_ECDSA_WITH_NULL_SHA
TLS_ECDHE_RSA_WITH_NULL_SHA
SSL_RSA_WITH_NULL_SHA
TLS_ECDH_ECDSA_WITH_NULL_SHA
TLS_ECDH_RSA_WITH_NULL_SHA
TLS_ECDH_anon_WITH_NULL_SHA
SSL_RSA_WITH_NULL_MD5
TLS_KRB5_WITH_3DES_EDE_CBC_SHA
TLS_KRB5_WITH_3DES_EDE_CBC_MD5
TLS_KRB5_WITH_DES_CBC_SHA
TLS_KRB5_WITH_DES_CBC_MD5
TLS_KRB5_EXPORT_WITH_DES_CBC_40_SHA
TLS_KRB5_EXPORT_WITH_DES_CBC_40_MD5
Test validity of URL:http://www.yahoo.com
Got redirected to new URL:https://www.yahoo.com/
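The redirect above goes from http to https, which HttpURLConnection will not follow on its own, so it has to be handled by hand. If a site redirects more than once, or sends a relative Location value, a small loop may be more robust than following a single hop. This is only a sketch under my own assumptions: the RedirectHelper class, its method names, the hop limit, and the timeouts are hypothetical, and I also treat 307 and 308 as redirects, which the code above does not.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper, for illustration only.
final class RedirectHelper {

    // Status codes treated as redirects (307 and 308 added as an assumption).
    static boolean isRedirect(int code) {
        return code == 301 || code == 302 || code == 303 || code == 307 || code == 308;
    }

    // Resolves a Location header against the URL that produced it,
    // so relative values such as "/login" also work.
    static URL resolve(URL base, String location) throws MalformedURLException {
        return new URL(base, location);
    }

    // Follows up to maxHops redirects and returns the status of the last response.
    // Assumes every hop is an http or https URL.
    static int finalStatus(String start, int maxHops) throws IOException {
        URL url = new URL(start);
        for (int hop = 0; hop <= maxHops; hop++) {
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setInstanceFollowRedirects(false); // every hop is handled here
            conn.setConnectTimeout(5000);           // assumed timeout, in milliseconds
            conn.setReadTimeout(5000);              // assumed timeout, in milliseconds
            try {
                int code = conn.getResponseCode();
                String location = conn.getHeaderField("Location");
                if (!isRedirect(code) || location == null) {
                    return code;
                }
                url = resolve(url, location);
            } finally {
                conn.disconnect();
            }
        }
        throw new IOException("Too many redirects starting from " + start);
    }
}
```

Something like RedirectHelper.finalStatus("http://www.yahoo.com", 5) would then return the status of the page reached after the redirects, or throw if the chain is longer than five hops.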
Upvotes: 0