Memento Mori

Reputation: 3402

Java file IO truncated while reading large files using BufferedInputStream

I have a function in which I am only given a BufferedInputStream and no other information about the file to be read. I unfortunately cannot alter the method definition as it is called by code I don't have access to. I've been using the code below to read the file and place its contents in a String:

public String[] doImport(BufferedInputStream stream) throws IOException, PersistenceException {
    int bytesAvail = stream.available();
    byte[] bytesRead = new byte[bytesAvail];
    stream.read(bytesRead);
    stream.close();
    String fileContents = new String(bytesRead);
    //more code here working with fileContents
}

My problem is that for large files (>2 GB), this code either runs extremely slowly or truncates the data, depending on the computer the program is executed on. Does anyone have a recommendation for handling large files in this situation?

Upvotes: 0

Views: 2048

Answers (2)

Jacob

Reputation: 1749

I'm not sure why you think you can't read it line-by-line. BufferedInputStream only describes how the underlying stream is accessed; it doesn't impose any restrictions on how you ultimately read data from it. You can use it just as you would any other InputStream.

For example, to read it line-by-line you could do:

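// Decode bytes to characters using the platform default charset
InputStreamReader streamReader = new InputStreamReader(stream);
// BufferedReader (java.io) is the class that provides readLine()
BufferedReader lineReader = new BufferedReader(streamReader);
String line = lineReader.readLine(); // returns null at end of stream
...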
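
To make the snippet above a bit more concrete, here is a minimal sketch of a full read loop. The class name LineByLineSketch, the helper countLines, and the choice of UTF-8 are assumptions made for illustration; they are not from the question, and the real per-line work would replace the counter.

```java
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class LineByLineSketch {

    // Hypothetical helper: walks the stream one line at a time, so only the
    // current line is held in memory rather than the whole file.
    static long countLines(BufferedInputStream stream) throws IOException {
        long count = 0;
        // UTF-8 is an assumption; use whatever encoding the file really has.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(stream, StandardCharsets.UTF_8))) {
            String line;
            // readLine() returns null once the end of the stream is reached
            while ((line = reader.readLine()) != null) {
                count++; // placeholder for the real per-line processing
            }
        } // closing the reader also closes the underlying stream
        return count;
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = "first\nsecond\nthird\n".getBytes(StandardCharsets.UTF_8);
        BufferedInputStream stream =
                new BufferedInputStream(new ByteArrayInputStream(sample));
        System.out.println(countLines(stream)); // prints 3
    }
}
```

Because nothing larger than a single line is ever kept around, this shape should cope with files well beyond 2 GB, provided the per-line work does not itself accumulate everything.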

[Edit] This response is to the original wording of the question, which asked specifically for a way to read the input file line-by-line.

Upvotes: 0

Ernest Friedman-Hill

Reputation: 81724

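You're assuming that available() returns the size of the file; it does not. It returns an estimate of the number of bytes that can be read without blocking, and that may be any number less than or equal to the size of the file. Because it returns an int, it can also never report more than Integer.MAX_VALUE (about 2 GB), however large the file is.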

Unfortunately, there's no way to do what you want in just one shot without some other source of information on the length of the file data (e.g., by calling java.io.File.length()). Instead, you have to accumulate the data from multiple reads. One way is to use a ByteArrayOutputStream: read into a fixed-size array, then write the data you read into the ByteArrayOutputStream. At the end, pull the byte array out with toByteArray(). You'll need to use the three-argument forms of read() and write() and check the return value of read() so you know exactly how many bytes were read into the buffer on each call.
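
A minimal sketch of that accumulate-in-a-loop approach might look like the following. The class name ReadAllBytesSketch, the helper readFully, and the 8192-byte buffer size are illustrative assumptions, not anything from the question.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadAllBytesSketch {

    // Hypothetical helper: drains the stream with repeated reads instead of
    // trusting available() to report the total size.
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192]; // arbitrary chunk size (an assumption)
        int n;
        // read() returns how many bytes it placed in buffer, or -1 at end of stream
        while ((n = in.read(buffer, 0, buffer.length)) != -1) {
            out.write(buffer, 0, n); // copy only the bytes actually read
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] source = new byte[100_000];
        for (int i = 0; i < source.length; i++) {
            source[i] = (byte) i;
        }
        try (BufferedInputStream stream =
                new BufferedInputStream(new ByteArrayInputStream(source))) {
            byte[] copy = readFully(stream);
            System.out.println(copy.length); // prints 100000
        }
    }
}
```

Note that a single Java byte array is indexed by int, so this still cannot hold more than roughly 2 GB in one piece. For files larger than that, the same loop would need to process each chunk as it arrives rather than accumulate everything.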

Upvotes: 1
