Namalak

Reputation: 165

Missing and unexpected characters when reading a large input stream using BufferedInputStream in Java

I have to read a large InputStream coming from a URL. I wrap the InputStream in a BufferedInputStream, read it into a byte[], convert each chunk to a String, and append it to a StringBuilder. After all the data has been appended, the resulting String contains missing and unexpected characters. I didn't specify an encoding (e.g. UTF-8) because the response already arrives in the format I expect.

Can you give any suggestions to solve this?

Code :

    BufferedInputStream brIn    = new BufferedInputStream(connection.getInputStream());
    StringBuilder response      = new StringBuilder(1000);

    byte[] byteBfr  = new byte[8192];
    int n=0;

    while((n=brIn.read(byteBfr,0,byteBfr.length)) != -1){
        response.append(new String(byteBfr).toCharArray(),0,n);
    }

    return  response.toString();

Output : This is a part of the resulting response. The complete one contains about 554595 lines.

Expected Result :

  <Hotel>
    <CiID>31</CiID>
    <HoID>58617</HoID>
    <Name>HARRY΄S</Name>
    <Address>PROTARAS</Address>
    <Phone>00357 23 834100</Phone>
    <Fax>0035723831860</Fax>
    <Stars>3</Stars>
  </Hotel>

Actual Result :

  <Hotel>
    <CiID>31</CiID>
    <HoID>58617</HoID>
    <Name>HARRY΄S</Name>
    <Address>PROTARAS</AdAdress>
 <   <Phone>00357 23 834100</Phone>
    <Fa9x>00390<P654224546</Fax>
    <Stars>3</Stars>
  </Hotel>

In the actual result above you can see the unexpected characters in the Address, Phone and Fax elements.

Upvotes: 1

Views: 1418

Answers (1)

Ben Taitelbaum

Reputation: 7403

Since you're reading in the entire string at once (as opposed to processing it as it arrives), consider using a BufferedReader.

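import java.io.*;
import java.net.*;

public class UrlReading {
  public static void main(String[] args) throws Exception {
    URL url = new URL("http://google.com");
    // Decode with an explicit charset instead of the platform default,
    // and close the reader automatically when done.
    try (BufferedReader reader = new BufferedReader(
            new InputStreamReader(url.openConnection().getInputStream(), "UTF-8"))) {
      String inputLine;
      while ((inputLine = reader.readLine()) != null) {
        // Print the line just read; calling readLine() again here would skip every other line.
        System.out.println(inputLine);
      }
    }
  }
}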
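
If you do need the whole response as a single String, a likely cause of the corruption is the loop in the question. `new String(byteBfr)` decodes the entire 8192-byte buffer with the platform default charset, including stale bytes left over from earlier reads, and `n` is a byte count that is then used as a char count. As soon as the data contains a multi-byte character (such as the ΄ in HARRY΄S), the two counts diverge, and a character split across two reads is decoded incorrectly. A minimal sketch of one way around this is to collect the raw bytes first and decode them once. The class name `StreamToString` and the helper `readAll` are hypothetical, and UTF-8 is an assumption about what the server actually sends:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class StreamToString {
    // Hypothetical helper: copies every byte from the stream, then decodes once
    // with the given charset so no multi-byte character is split between reads.
    public static String readAll(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer, 0, buffer.length)) != -1) {
            // Only the first n bytes of the buffer are valid for this read.
            collected.write(buffer, 0, n);
        }
        return new String(collected.toByteArray(), charset);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for connection.getInputStream(); UTF-8 is assumed here.
        byte[] data = "<Name>HARRY\u0384S</Name>".getBytes(StandardCharsets.UTF_8);
        String text = readAll(new ByteArrayInputStream(data), StandardCharsets.UTF_8);
        System.out.println(text.length());
    }
}
```

In the question's code this would mean calling something like `readAll(connection.getInputStream(), StandardCharsets.UTF_8)`, with the charset replaced by whatever the response really uses.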

Alternatively, if you're reading XML, consider letting an XML parser read the stream for you, for example:

Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse("http://google.com");
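
As a rough illustration of the parser route, the sketch below reads XML from an InputStream and pulls out the text of each `Name` element. The parser takes the encoding from the XML declaration, so no manual byte-to-String conversion is needed. The class `HotelNames`, the method `hotelNames`, and the `Hotels` root element are assumptions made for the example; only the `Hotel` structure comes from the question:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class HotelNames {
    // Hypothetical helper: parses the stream and returns the text of every <Name> element.
    public static List<String> hotelNames(InputStream xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
        NodeList names = doc.getElementsByTagName("Name");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < names.getLength(); i++) {
            result.add(names.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Sample document; the <Hotels> root element is assumed for illustration.
        String sample = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<Hotels><Hotel><CiID>31</CiID><HoID>58617</HoID>"
                + "<Name>HARRY\u0384S</Name></Hotel></Hotels>";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        System.out.println(hotelNames(in).size());
    }
}
```

For a response of several hundred thousand lines, a DOM tree holds the whole document in memory, so a streaming parser may be a better fit if memory becomes a concern.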

Upvotes: 2
