YosiFZ

Reputation: 7900

Java download html

I am trying to download the HTML of a website:

    String encoding = "UTF-8";
    HttpContext localContext = new BasicHttpContext();
    HttpClient httpclient = new DefaultHttpClient();
    HttpGet httpget = new HttpGet(MYURL);
    httpget.setHeader("User-Agent", "Mozilla/5.0 (iPhone; CPU iPhone OS 5_0 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9A334 Safari/7534.48.3");

    HttpResponse response = httpclient.execute(httpget, localContext);
    HttpEntity entity = response.getEntity();
    InputStream instream = entity.getContent();
    String html = getStringFromInputStream(encoding, instream);

At the end of the html string I get:

...
21912
0
0

I don't get the full HTML. Any idea how to fix this?

EDIT

    private static String getStringFromInputStream(String encoding, InputStream instream) throws UnsupportedEncodingException, IOException {
        Writer writer = new StringWriter();
        char[] buffer = new char[1024];
        try {
            Reader reader = new BufferedReader(new InputStreamReader(instream, encoding));
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        } finally {
            instream.close();
        }
        String result = writer.toString();
        return result;
    }

Upvotes: 0

Views: 127

Answers (1)

Buhake Sindi

Reputation: 89169

I would suggest using EntityUtils instead:

    HttpEntity entity = response.getEntity();
    String html = EntityUtils.toString(entity);

or, to pass the encoding explicitly:

    HttpEntity entity = response.getEntity();
    String html = EntityUtils.toString(entity, encoding);
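
If pulling in EntityUtils is not an option, the same idea (drain the whole stream first, then decode it once with an explicit charset) can be sketched with the standard library only. This is a minimal sketch, not HttpClient code: `StreamText` and `readFully` are hypothetical names, and it assumes the caller already knows the encoding, since it does not look at the response's Content-Type header.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class StreamText {

    // Hypothetical helper (not part of HttpClient): reads the stream until
    // end-of-stream, then decodes the collected bytes with the given charset.
    public static String readFully(InputStream in, Charset charset) throws IOException {
        // try-with-resources closes the stream even if a read fails
        try (InputStream source = in) {
            ByteArrayOutputStream collected = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = source.read(buffer)) != -1) {
                collected.write(buffer, 0, n);
            }
            return new String(collected.toByteArray(), charset);
        }
    }

    public static void main(String[] args) throws IOException {
        // An in-memory stream stands in for entity.getContent() here
        byte[] page = "<html><body>h\u00e9llo</body></html>".getBytes(StandardCharsets.UTF_8);
        String html = readFully(new ByteArrayInputStream(page), StandardCharsets.UTF_8);
        System.out.println(html);
    }
}
```

With a real response you would pass `entity.getContent()` as the stream. Decoding only after all bytes are collected avoids splitting a multi-byte character across buffer boundaries.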

Upvotes: 1
