Reputation: 3329
I access a web page by passing a session ID and a URL, and the output is an HTML response. I want to use Jsoup to parse this response and get the tag elements. The Jsoup examples I have seen take a String to establish the connection. How do I proceed?
pseudo code:
I tried the above method and got this exception
java.io.IOException: 401 error loading URL http://www.abc.com/index
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:387)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:364)
at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:143)
at org.jsoup.helper.HttpConnection.get(HttpConnection.java:132)
Basically, entity.getContent() holds the HTML response, which has to be passed as a String to the connect method. But it doesn't work.
Upvotes: 0
Views: 3662
Reputation: 1109142
Apache Commons HttpClient and Jsoup do not share the same cookie store. You basically need to pass the very same cookies that HttpClient has retrieved on to Jsoup through its Connection. You can find some concrete examples here:
Alternatively, you can also just continue using HttpClient for firing HTTP requests and maintaining the cookies, and instead feed its HttpResponse as a String to Jsoup#parse().
So this should do:
HttpResponse httpResponse = httpclient1.execute(httpget, httpContext);
String html = EntityUtils.toString(httpResponse.getEntity());
Document doc = Jsoup.parse(html, testUrl);
// ...
By the way, you do not necessarily need to create a whole new HttpClient for a subsequent request. Just reuse the httpclient which you already created. Also, your way of obtaining the response as a String is clumsy. The EntityUtils.toString() line in the above example shows the simplest way to do it.
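For the first approach (handing the cookies over to Jsoup), the cookies have to end up as plain name/value pairs. The following is only a minimal sketch of that step, under the assumption that you have the raw Set-Cookie header values of the login response at hand. The class name CookieBridge and the method toCookieMap are hypothetical, and the parsing uses the JDK's java.net.HttpCookie rather than HttpClient's own cookie store.

```java
import java.net.HttpCookie;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of HttpClient or Jsoup.
public class CookieBridge {

    /** Collects cookie name/value pairs from raw Set-Cookie header values. */
    public static Map<String, String> toCookieMap(List<String> setCookieHeaders) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (String header : setCookieHeaders) {
            // HttpCookie.parse separates the name/value pair from attributes
            // such as Path; it throws IllegalArgumentException on malformed input.
            for (HttpCookie cookie : HttpCookie.parse(header)) {
                cookies.put(cookie.getName(), cookie.getValue());
            }
        }
        return cookies;
    }

    public static void main(String[] args) {
        // Made-up header value, for illustration only.
        Map<String, String> cookies = toCookieMap(List.of("JSESSIONID=abc123; Path=/"));
        System.out.println(cookies);
    }
}
```

The resulting map can then be passed to Jsoup with Jsoup.connect(url).cookies(cookies).get(), so that the Jsoup request carries the same session cookies as the HttpClient one.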
Upvotes: 1
Reputation: 49577
It shows an HTTP 401 error, which means:
Similar to 403 Forbidden, but specifically for use when authentication is possible but has failed or not yet been provided.
Therefore, I think you need to log in to the website from your Java code, or identify yourself by sending cookies with your request.
Upvotes: 0