Dominic Bou-Samra

Reputation: 15406

Writing large Byte Arrays out on the fly (so memory doesn't become an issue)

I am reading some large files in as byte arrays and adding them to an ArrayList. Bad idea... I end up with 100 MB of files in memory. I need them to be processed and written out line by line (or every x number of lines). How do I do this? BufferedWriter doesn't take byte arrays.

Upvotes: 0

Views: 1000

Answers (5)

Konrad Garus

Reputation: 54005

Dealing with data "on the fly" like this is called streaming. Data arrives from one stream, is processed, and saved to another stream, and you don't need to fit everything in memory.

If the data comes directly from a database, a ResultSet does not fetch all result rows at once. You can safely iterate over it in a loop and process rows "on the fly".

If it's a file, you can read it with a stream (e.g. FileInputStream) and prefetch smaller chunks. If it's a text file, use a Reader for all the automatic buffering and splitting on line breaks.

Wherever you get this data from, you can process it "on the fly" and write to output. Again, don't write to a byte array, but to a stream - such as FileOutputStream, or a Writer.
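A minimal sketch of the text-file case described above, assuming a line-oriented input file (the file names and the `copyLines` helper are placeholders, not part of the question):

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class StreamLines {

    // Reads the input file line by line and writes each line to the
    // output file, so only one line is held in memory at a time.
    public static void copyLines(String inPath, String outPath) throws IOException {
        BufferedReader reader = new BufferedReader(new FileReader(inPath));
        BufferedWriter writer = new BufferedWriter(new FileWriter(outPath));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line);  // process the line here instead of just copying it
                writer.newLine();
            }
        } finally {
            reader.close();
            writer.close();
        }
    }
}
```

The Reader/Writer pair handles buffering and line splitting for you; heap usage stays constant regardless of file size.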

Upvotes: 1

jmg

Reputation: 7414

If they are binary files, then it does not make sense to talk about reading them line by line. Anyhow, if you want to process large files sequentially without having them in memory as a whole, you are interested in streaming those files. Have a look at the InputStream and OutputStream types. And avoid all types with 'ByteArray' in their name.
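For the binary case, streaming boils down to copying fixed-size chunks between an InputStream and an OutputStream. A sketch (the class name and buffer size are illustrative choices, not from the answer):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamCopy {

    // Copies a stream in 8 KB chunks; memory use stays at the buffer
    // size no matter how large the underlying file is.
    public static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);  // process or transform the chunk here if needed
        }
    }
}
```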

Upvotes: 0

Peter Lawrey

Reputation: 533492

I would use a memory mapped file. This way you can have MBs or GBs "in memory" without using much heap at all (a few KB).

If the data is from an external source you can still place them into direct memory or memory mapped files.
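A sketch of the memory-mapped approach via FileChannel.map (the `sumBytes` example computation is illustrative, not part of the answer):

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MappedExample {

    // Maps the whole file into memory; the OS pages data in on demand,
    // so heap usage stays small even for very large files.
    public static long sumBytes(String path) throws IOException {
        RandomAccessFile file = new RandomAccessFile(path, "r");
        try {
            FileChannel channel = file.getChannel();
            MappedByteBuffer buffer =
                channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            long sum = 0;
            while (buffer.hasRemaining()) {
                sum += buffer.get() & 0xFF;  // read bytes straight from the mapping
            }
            return sum;
        } finally {
            file.close();
        }
    }
}
```

Note that a single mapping is limited to 2 GB (an int-sized buffer), so truly huge files need to be mapped in regions.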

Upvotes: 1

Anish Dasappan

Reputation: 415

Why are you not using BufferedReader?

If you are using Java 7, try NIO.2: http://download.oracle.com/javase/7/docs/api/java/nio/package-summary.html http://download.oracle.com/javase/7/docs/api/java/nio/file/package-summary.html
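A sketch combining the two suggestions, using the Java 7 java.nio.file API to get a BufferedReader (the `countLines` helper is a placeholder example, not from the answer):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class Nio2Read {

    // Counts lines via Files.newBufferedReader; the reader buffers
    // internally, so only a small window of the file is in memory.
    public static int countLines(String path) throws IOException {
        BufferedReader reader =
            Files.newBufferedReader(Paths.get(path), StandardCharsets.UTF_8);
        try {
            int count = 0;
            while (reader.readLine() != null) {
                count++;
            }
            return count;
        } finally {
            reader.close();
        }
    }
}
```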

Upvotes: 0

Norman

Reputation: 453

I'd imagine it all depends on how you are processing them. If you can process a single file at a time, do so. If the individual files are too large, read 100K at a time and parse for line breaks yourself. Process everything up to the last line break in the buffer, move the remainder to the beginning of the array, and read more data. These are simple techniques, but without knowing more about how you are processing, that's about all I could suggest.
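The chunk-and-carry technique above can be sketched as follows. This is an assumption-laden illustration (the class name, a StringBuilder in place of a raw array, and UTF-8 decoding are my choices; note that chunking raw bytes can split a multi-byte character, so this sketch assumes content where that is safe):

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public class ChunkParser {

    // Reads fixed-size chunks, emits every complete line, and carries
    // the partial line at the end of each chunk over to the next read.
    public static List<String> parseLines(InputStream in, int chunkSize) throws IOException {
        List<String> lines = new ArrayList<String>();
        StringBuilder remainder = new StringBuilder();
        byte[] chunk = new byte[chunkSize];
        int read;
        while ((read = in.read(chunk)) != -1) {
            remainder.append(new String(chunk, 0, read, "UTF-8"));
            int newline;
            while ((newline = remainder.indexOf("\n")) != -1) {
                lines.add(remainder.substring(0, newline));  // process a complete line
                remainder.delete(0, newline + 1);            // keep only the leftover tail
            }
        }
        if (remainder.length() > 0) {
            lines.add(remainder.toString());  // final line without a trailing break
        }
        return lines;
    }
}
```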

Upvotes: 1
