Reputation: 3858
I am using:
PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter("test.txt"), 1024 * 1024 * 500));
to write a large file (approx. 2 GB). It takes 26 seconds to write. But when I replace 500 with 10 or 20, it takes 19 seconds.
From what I have read, buffering gives better performance. If so, why is this happening? I checked by running each variant 5 times, so system/IO load is not an issue.
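For reference, a minimal sketch of the timing experiment (the line content and line count are placeholders chosen to produce roughly 2 GB; the file name is from the snippet above):

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

public class BufferSizeTiming {
    public static void main(String[] args) throws IOException {
        // Compare a 500 MB buffer against a 10 MB one (BufferedWriter sizes are in chars)
        int[] bufferSizes = {1024 * 1024 * 500, 1024 * 1024 * 10};
        for (int size : bufferSizes) {
            long start = System.nanoTime();
            try (PrintWriter out = new PrintWriter(
                    new BufferedWriter(new FileWriter("test.txt"), size))) {
                // 20 million lines of ~100 chars each is roughly 2 GB of output
                for (int i = 0; i < 20_000_000; i++) {
                    out.println("0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789");
                }
            }
            long elapsed = System.nanoTime() - start;
            System.out.printf("buffer = %d chars -> %.1f s%n", size, elapsed / 1e9);
        }
    }
}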
Upvotes: 4
Views: 1601
Reputation: 533492
As I said in a previous question, there is an optimal buffer size (typically around 32 KB), and making the buffer larger than that is slower, not faster. The default buffer size is 8 KB.
BTW: how large is your L2/L3 CPU cache? About 10 MB, I suspect. And your primary L1 cache is about 32 KB?
By using a buffer which fits into the fastest cache, you are using the fastest memory. By using a buffer which only fits in main memory, you are using the slowest memory, which can be as much as 10x slower.
In answer to your question: what I do is assume ISO-8859-1 encoding, i.e. (byte) ch, and write a byte at a time to a ByteBuffer, possibly memory mapped. I have methods for writing/reading long and double to/from a ByteBuffer without creating any garbage. Using this approach you can log about 5 million lines per second to disk.
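A minimal sketch of that pattern, assuming a memory-mapped file; the class name, the fixed mapping size, and the digit-formatting loop are illustrations, not the exact code referred to above:

import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MappedLogger {
    private final MappedByteBuffer buffer;

    MappedLogger(String path, long size) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(path, "rw")) {
            // The mapping stays valid even after the file is closed
            buffer = raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, size);
        }
    }

    // ISO-8859-1 assumed: each char maps to one byte, so no encoder and no garbage
    void writeText(CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            buffer.put((byte) s.charAt(i));
        }
    }

    // Write a long as ASCII digits straight into the buffer, again allocation-free
    // (Long.MIN_VALUE is not handled in this sketch)
    void writeLong(long value) {
        if (value < 0) {
            buffer.put((byte) '-');
            value = -value;
        }
        long div = 1;
        while (value / div >= 10) {
            div *= 10; // find the highest power of ten <= value
        }
        while (div > 0) {
            buffer.put((byte) ('0' + (value / div) % 10));
            div /= 10;
        }
    }
}

Because a put into a mapped buffer is just a memory write that the OS flushes in the background, there is no per-line system call, which is where the millions-of-lines-per-second figure comes from.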
Upvotes: 3
Reputation: 718758
Buffering your I/O improves performance up to a point by reducing the number of system calls made. But system calls are not that expensive (each carries perhaps a few microseconds of overhead), and an overly large buffer can cause problems in other areas. For example:
- A 500 MB buffer uses a lot of memory, and potentially increases GC overheads or the system's paging load.
- If you write 500 MB in a single write call, the write could saturate the system's buffer cache and overwhelm its ability to overlap disc writes with other work at the application level.
Just try using a (significantly) smaller buffer. (I personally wouldn't use a buffer bigger than 8 KB without doing some application-specific tuning.)
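For instance, a hedged baseline, reusing the file name from the question and an 8 KB buffer (the same as BufferedWriter's default):

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

public class ModestBuffer {
    public static void main(String[] args) throws IOException {
        // 8 KB explicit buffer; tune upward only if measurement shows a benefit
        try (PrintWriter out = new PrintWriter(
                new BufferedWriter(new FileWriter("test.txt"), 8 * 1024))) {
            out.println("sample line"); // placeholder payload
        }
    }
}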
Upvotes: 1
Reputation: 38676
First, you really don't need a buffer that big. Generally 64 KB, or even as low as 8 KB, is sufficient to get decent I/O performance. Any larger and you're just wasting memory and CPU, because as the buffer gets bigger, more time is spent at the I/O layer writing one big chunk of data. So it's a trade-off (a min-max problem, if you understand calculus) between waiting on I/O and just writing to memory.
You can't shove huge buffers at the I/O device, because it has an internal fixed-size buffer of its own. The point is to match it as closely as possible, while accepting that you can't really do so, because you don't know what other processes are doing. The best thing is to start low (8-16 KB), run it, and measure. Double the buffer to 32 KB, and so on; run it, measure it. If you get a speed improvement, double again. Once you stop getting improvements, divide by 2 and stop (see the sketch at the end of this answer).
So if you wrote 2 GB of data in 26 s, that's a throughput of about 79 MB/s, or roughly 630 Mbit/s. You could probably improve it just by lowering the buffer size to something reasonable.
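A sketch of that measure-and-double loop (the payload, size range, and file name are assumptions for illustration):

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

public class BufferTuner {
    public static void main(String[] args) throws IOException {
        long best = Long.MAX_VALUE;
        // Start small and double until the timing stops improving, then back off
        for (int size = 8 * 1024; size <= 1024 * 1024; size *= 2) {
            long elapsed = timeWrite(size);
            System.out.printf("%7d chars -> %.2f s%n", size, elapsed / 1e9);
            if (elapsed >= best) {
                System.out.println("no further improvement; settle on " + (size / 2) + " chars");
                break;
            }
            best = elapsed;
        }
    }

    static long timeWrite(int bufferSize) throws IOException {
        long start = System.nanoTime();
        try (PrintWriter out = new PrintWriter(
                new BufferedWriter(new FileWriter("test.txt"), bufferSize))) {
            for (int i = 0; i < 1_000_000; i++) { // placeholder workload
                out.println("0123456789012345678901234567890123456789");
            }
        }
        return System.nanoTime() - start;
    }
}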
Upvotes: 1
Reputation: 5069
Having an overly large buffer decreases performance. Stick to around 32-64 KB, IMO.
Upvotes: 2
Reputation: 68847
Very large buffers (500 MB) are also not good, because it is harder for the OS to do the memory management for such a huge buffer.
Compare it with moving a table in your house instead of moving a box: the table is unwieldy. But if your boxes become too small, you will have to walk back and forth many times.
Don't forget that allocating memory is an O(n) operation.
Upvotes: 1
Reputation: 11638
1024*1024*500 is 500 megabytes, give or take a smidgen. You're basically forcing the JVM to allocate a 500 MB block of contiguous memory (in fact about 1 GB, since BufferedWriter's buffer is a char array and each char takes 2 bytes), which probably forces a GC cycle just to make room.
Upvotes: 1