user3421917

Reputation: 1

Cannot read a URL from Java code

I'm desperate to get the content of this URL.

No authentication is required when I access this page from a web browser, but when I try to get the content from a web application I get an sso file as the response. The code I used is as follows:

    HttpClient httpClient = new DefaultHttpClient();
    HttpGet httpGet = new HttpGet("http://search.lib.monash.edu/primo_library/libweb/action/search.do?dscnt=1&frbg=&tab=default_tab&srt=rank&ct=search&mode=Basic&dum=true&tb=&indx=1&vl%28freeText0%29=java&fn=search&vid=MON");
    HttpResponse httpResponse = httpClient.execute(httpGet);
    HttpEntity responseEntity = httpResponse.getEntity();

    BufferedReader in = new BufferedReader(
            new InputStreamReader(responseEntity.getContent()));
    String inputLine;
    StringBuffer response = new StringBuffer();

    while ((inputLine = in.readLine()) != null) {
        response.append(inputLine);
    }
    in.close();

    System.out.println(response.toString());

The sso file I get as the response is as follows:

<!-- filename: sso --> <html> <head> <title>Login </title> <!-- START filename: meta-tags.pds --> <META HTTP-EQUIV="Cache-Control" CONTENT="no-cache">  <META HTTP-EQUIV="Pragma" CONTENT="no-cache">  <META HTTP-EQUIV="Expires" CONTENT="Sun, 06 Nov 1994 08:49:37 GMT">  <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8"> <!-- END   filename: meta-tags.pds --> <link rel="stylesheet" href="http://monash-dc05.hosted.exlibrisgroup.com:8991/PDSMExlibris.css" TYPE="text/css"> </head> <body onload = "location = '/goto/http://search.lib.monash.edu:80/primo_library/libweb/action/login.do?afterPDS=true&vid=MON&vid=MON&dscnt=2&targetURL=http%3A%2F%2Fsearch.lib.monash.edu%2Fprimo_library%2Flibweb%2Faction%2Fsearch.do%3Fdscnt%3D0&frbg=&tab=default%5Ftab&dstmp=1394940513823&srt=rank&ct=search&mode=Basic&dum=true&indx=1&tb=&vl%28freeText0%29=java&fn=search&pds_handle=GUEST';"> <noscript> <div id="header">      <div>         <img src="http://monash-dc05.hosted.exlibrisgroup.com:8991//exlibris/primo/p4_1/pds/html_form/icon/exlibrislogo.jpg" alt="Exlibris Logo"><p>&nbsp;</p>     </div> </div> <div id="connect">  <a href="/goto/http://search.lib.monash.edu:80/primo_library/libweb/action/login.do?afterPDS=true&vid=MON&vid=MON&dscnt=2&targetURL=http%3A%2F%2Fsearch.lib.monash.edu%2Fprimo_library%2Flibweb%2Faction%2Fsearch.do%3Fdscnt%3D0&frbg=&tab=default%5Ftab&dstmp=1394940513823&srt=rank&ct=search&mode=Basic&dum=true&indx=1&tb=&vl%28freeText0%29=java&fn=search&pds_handle=GUEST">Return from Check SSO </a></noscript> </div> </body> </html></body></html>

Please help.

Upvotes: 0

Views: 253

Answers (1)

Ravinder Reddy

Reputation: 23982

This is not an authentication issue.

The returned page has an onload event handler on its body element. Because of that, when you open the URL in a browser:

  1. The browser first receives the same HTML that you have in your response string.
  2. It then starts to render and display it.
  3. In the meantime, the onload event fires and loads the URL defined by location='/goto/......
  4. Before the current page is displayed, the new page is received and displayed in the browser.

From the response you received, observe this:

<body onload = "location = '/goto/http://search.lib.monash.edu:80/primo_library/libweb/action/login.do?afterPDS=true&vid=MON&vid=MON&dscnt=2&targetURL=http%3A%2F%2Fsearch.lib.monash.edu%2Fprimo_library%2Flibweb%2Faction%2Fsearch.do%3Fdscnt%3D0&frbg=&tab=default%5Ftab&dstmp=1394940513823&srt=rank&ct=search&mode=Basic&dum=true&indx=1&tb=&vl%28freeText0%29=java&fn=search&pds_handle=GUEST';">

In the Java code, you are only reading the content from the URL you specified.
You are not passing it to anything that parses the HTML and runs its scripts, so it is treated as static text and the onload handler never fires.

That is why the response you see in the Java code differs from what you see in a web browser.
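
To get the same page from Java, the code would have to do by hand what the browser's onload handler does: pull the target out of the body tag and request it. A minimal sketch, assuming the redirect always has the location = '...' form shown above. The class name OnloadRedirect, its method names, and the example.com URL are made up for illustration; only JDK classes are used.

```java
import java.net.URI;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OnloadRedirect {

    // Assumption: the handler looks like  onload = "location = '<target>';"
    // as in the sso response above. Other pages may need a different pattern.
    private static final Pattern ONLOAD_LOCATION = Pattern.compile(
            "onload\\s*=\\s*\"\\s*location\\s*=\\s*'([^']*)'",
            Pattern.CASE_INSENSITIVE);

    /** Returns the raw target of the onload redirect, if the page has one. */
    public static Optional<String> extractOnloadLocation(String html) {
        Matcher m = ONLOAD_LOCATION.matcher(html);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /** Resolves a relative target such as "/goto/..." against the URL of the page that contained it. */
    public static String resolve(String pageUrl, String target) {
        return URI.create(pageUrl).resolve(target).toString();
    }

    public static void main(String[] args) {
        // Hypothetical page URL and shortened body tag, for illustration only.
        String pageUrl = "http://example.com:8991/pds?func=sso";
        String html = "<body onload = \"location = '/goto/next?x=1';\">";

        extractOnloadLocation(html)
                .map(target -> resolve(pageUrl, target))
                .ifPresent(System.out::println); // prints http://example.com:8991/goto/next?x=1
    }
}
```

The resolved URL can then be passed to a second HttpGet. Two caveats: the relative target has to be resolved against the URL of the page that actually returned the HTML, which may differ from the URL you requested if the client followed redirects on the way, and the second request may need to go through the same HttpClient instance so that any cookies set by the first response are sent along.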

Other suggestions:
readLine() strips the line terminator, so when you read a line and append it to a buffer, you should also append a CRLF to it.

Change:

    response.append(inputLine);

To:

    response.append( inputLine ).append( "\r\n" );

This keeps the response text on multiple lines and makes it more readable.
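
The same change can be folded into a small helper. This is only a sketch: the class name BodyReader and the method name readAllLines are made up for illustration, and it takes any Reader so it can be tried without a network call. In the question's code the argument would be the InputStreamReader wrapped around responseEntity.getContent().

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class BodyReader {

    /** Reads everything from the reader, ending each line with CRLF. */
    public static String readAllLines(Reader source) {
        StringBuilder response = new StringBuilder();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader in = new BufferedReader(source)) {
            String inputLine;
            while ((inputLine = in.readLine()) != null) {
                response.append(inputLine).append("\r\n");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return response.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the HTTP response body.
        System.out.print(readAllLines(new StringReader("<html>\n<body>\n</html>")));
    }
}
```

StringBuilder is used instead of StringBuffer because the buffer is only touched by one thread, so the synchronization in StringBuffer is not needed here.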

Upvotes: 1
