How can it be that multiple writes to a file are faster than a single one?

Question

First piece of code:

// code is a private "global variable" for the class
// SourceCodeBuilder is a class that uses StringBuilder()
// basically it is based on String(s), formatted and with many appends depending on the "loc()" calls (see below)
private SourceCodeBuilder code = new SourceCodeBuilder();

[...]

    // create "file.txt" and call algorithm
    fileOut = new FileWriter("file.txt");

    for (int i=0; i



Where algorithm() is a method like this:

private void algorithm () {
    for (int i=0; i


When I run this on large text files, it takes about 4500 ms to execute and uses less than 60 MB of memory.

Then I tried to use this other code.
Second piece of code:

private SourceCodeBuilder code = new SourceCodeBuilder();

[...]

    // create "file.txt" and call algorithm
    fileOut = new FileWriter("file.txt");

    for (int i=0; i


Where this time algorithm() is a method like this:

private void algorithm () {
    for (int i=0; i


It uses more than 250 MB of memory (which is expected, because I never call the "free()" method on the code variable, so it is one continuous append to the same variable), but surprisingly it takes more than 5300 ms to execute.
That's about 16% slower than the first code, and I can't explain why.

In the first code I write small pieces of text to "file.txt" many times. In the second code I write one big piece of text to "file.txt" a single time, using more memory. With the second code I expected higher memory consumption, but not a longer run time, since it is the first code that performs more I/O operations.

Conclusion: the first piece of code is faster than the second one, even though the first one does more I/O operations. Why? Am I missing something?

Sergey Kalinichenko · Accepted Answer

When you gradually fill a large memory buffer, the time required for that grows non-linearly, because the buffer has to be re-allocated multiple times as it grows, and each re-allocation copies the entire content to a new location in memory. This takes time, especially when the buffer is 200 MB or more. If you preallocate the buffer, your process may go faster.

However, all the above is just my guess. You should profile your application to see where the additional time really goes.
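
To check the preallocation idea in isolation, something like the following rough sketch might help. It is not the asker's SourceCodeBuilder: the class name PreallocDemo, both method names, and the sample line and count are made up for illustration. It builds the same large string twice with a plain StringBuilder, once with the default capacity and once with the capacity reserved up front, and times each with System.nanoTime().

```java
// Hypothetical demo: all names here are illustrative and not taken from the question.
class PreallocDemo {

    // Appends `piece` `count` times and lets the builder grow on demand.
    // Each time the capacity runs out, the content is copied into a larger array.
    static String buildDefault(int count, String piece) {
        StringBuilder sb = new StringBuilder(); // default initial capacity: 16 chars
        for (int i = 0; i < count; i++) {
            sb.append(piece);
        }
        return sb.toString();
    }

    // Produces the same output, but reserves the final capacity up front,
    // so the appends never trigger a regrowth copy.
    // Assumes count * piece.length() fits in an int.
    static String buildPreallocated(int count, String piece) {
        StringBuilder sb = new StringBuilder(count * piece.length());
        for (int i = 0; i < count; i++) {
            sb.append(piece);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int count = 1_000_000;                          // assumed workload size
        String piece = "some generated line of code\n"; // assumed sample line

        long t0 = System.nanoTime();
        String a = buildDefault(count, piece);
        long t1 = System.nanoTime();
        String b = buildPreallocated(count, piece);
        long t2 = System.nanoTime();

        System.out.println("default capacity: " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("preallocated:     " + (t2 - t1) / 1_000_000 + " ms");
        System.out.println("same content:     " + a.equals(b));
    }
}
```

A single timed run like this is only a crude indication, because JIT warm-up and garbage collection can dominate the numbers. Repeating the measurement several times, or using a real profiler on the actual application, will give a more trustworthy picture.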
