Reputation: 15406
I am reading some large files in as byte arrays and adding them to an ArrayList. Bad idea... I end up with 100 MB of files in memory. I need them to be processed and output line by line (or every x lines). How do I do this? BufferedWriter doesn't take byte arrays.
Upvotes: 0
Views: 1000
Reputation: 54005
Dealing with data "on the fly" like this is called streaming. Data arrives from one stream, is processed, and saved to another stream, and you don't need to fit everything in memory.
If it comes directly from a database, a ResultSet does not fetch all result rows at once. You can safely iterate over it in a loop and process rows "on the fly".
If it's a file, you can read it with a stream (e.g. a FileInputStream) and fetch smaller chunks at a time. If it's a text file, wrap it in a BufferedReader to get automatic buffering and splitting on line breaks.
Wherever you get this data from, you can process it "on the fly" and write it to the output. Again, don't write to a byte array, but to a stream, such as a FileOutputStream, or a Writer.
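As a minimal sketch of what this looks like for a text file (the method name and file arguments are placeholders, and the processing step is left to you):

    import java.io.*;

    // A minimal sketch, assuming a text input file; src and dst are placeholders.
    static void streamLineByLine(File src, File dst) throws IOException {
        try (BufferedReader reader = new BufferedReader(new FileReader(src));
             BufferedWriter writer = new BufferedWriter(new FileWriter(dst))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // process the line here; only one line is in memory at a time
                writer.write(line);
                writer.newLine();
            }
        }
    }

No matter how large the file is, the heap only ever holds the current line plus the small internal buffers of the reader and writer.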
Upvotes: 1
Reputation: 7414
If they are binary files, then it does not make sense to talk about reading them line by line. Anyhow, if you want to process large files sequentially without holding them in memory as a whole, you are interested in streaming those files. Have a look at the InputStream and OutputStream types, and avoid all types with 'ByteArray' in their name.
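For instance, a sketch of processing a binary file in fixed-size chunks (the 8 KB buffer size is an arbitrary choice):

    import java.io.*;

    // A minimal sketch: stream a binary file through a small, fixed-size buffer.
    static void streamBytes(File src, File dst) throws IOException {
        try (InputStream in = new FileInputStream(src);
             OutputStream out = new FileOutputStream(dst)) {
            byte[] buffer = new byte[8192]; // 8 KB at a time, never the whole file
            int read;
            while ((read = in.read(buffer)) != -1) {
                // process buffer[0..read) here, then write it on
                out.write(buffer, 0, read);
            }
        }
    }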
Upvotes: 0
Reputation: 533492
I would use a memory-mapped file. This way you can have MBs or GBs "in memory" without using much heap at all (a few KB). If the data comes from an external source, you can still place it into direct memory or memory-mapped files.
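A sketch of what this could look like with FileChannel.map (the file name is a placeholder, and note that a single mapping is limited to 2 GB, so larger files would need to be mapped in windows):

    import java.io.*;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    // A minimal sketch; "big.dat" is a placeholder. The OS pages data in on
    // demand, so the Java heap holds little more than the buffer object itself.
    static void scanMapped() throws IOException {
        try (RandomAccessFile file = new RandomAccessFile("big.dat", "r");
             FileChannel channel = file.getChannel()) {
            MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            while (map.hasRemaining()) {
                byte b = map.get();
                // process b
            }
        }
    }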
Upvotes: 1
Reputation: 415
Why are you not using a BufferedReader? If you are using Java 7, try NIO.2: http://download.oracle.com/javase/7/docs/api/java/nio/package-summary.html http://download.oracle.com/javase/7/docs/api/java/nio/file/package-summary.html
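A sketch using the Java 7 NIO.2 API (the path and charset are assumptions):

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.*;

    // A minimal NIO.2 sketch (Java 7+); "input.txt" is a placeholder.
    static void readWithNio2() throws IOException {
        Path path = Paths.get("input.txt");
        try (BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // process each line as it is read
            }
        }
    }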
Upvotes: 0
Reputation: 453
I'd imagine it all depends on how you are processing them. If you can process a single file at a time, do so. If the individual files are too large, read 100K at a time and parse for line breaks yourself: process everything up to the last line break in the buffer, move the remainder to the beginning of the array, and read more data. These are simple techniques, but without knowing more about how you are processing the files, that's about all I can suggest.
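A rough sketch of that chunk-and-carry technique (file name, chunk size, and what you do with the decoded text are all placeholders; it also assumes no single line is longer than the buffer):

    import java.io.*;
    import java.nio.charset.StandardCharsets;

    // A rough sketch of reading ~100K at a time and splitting on line breaks
    // manually. Assumes no single line is longer than the buffer.
    static void processInChunks(File f) throws IOException {
        try (InputStream in = new FileInputStream(f)) {
            byte[] buffer = new byte[100 * 1024]; // roughly 100K per read
            int leftover = 0; // bytes carried over from the previous chunk
            int read;
            while ((read = in.read(buffer, leftover, buffer.length - leftover)) != -1) {
                int filled = leftover + read;
                int lastBreak = -1;
                for (int i = filled - 1; i >= 0; i--) { // find the last line break
                    if (buffer[i] == '\n') { lastBreak = i; break; }
                }
                if (lastBreak >= 0) {
                    // process everything up to and including the last line break...
                    String lines = new String(buffer, 0, lastBreak + 1, StandardCharsets.UTF_8);
                    // ...then move the remainder to the beginning of the array
                    leftover = filled - lastBreak - 1;
                    System.arraycopy(buffer, lastBreak + 1, buffer, 0, leftover);
                } else {
                    leftover = filled; // no line break yet, keep reading
                }
            }
            if (leftover > 0) {
                // process the final partial line, which has no trailing break
                String tail = new String(buffer, 0, leftover, StandardCharsets.UTF_8);
            }
        }
    }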
Upvotes: 1