MathematicalOrchid

Reputation: 62808

Why is this C++ program slower on Windows than Linux?

Consider the following program:

#define _FILE_OFFSET_BITS 64   // Allow large files.
#define REVISION "POSIX Revision #9"

#include <iostream>
#include <cstdio>
#include <ctime>

const int block_size = 1024 * 1024;
const char block[block_size] = {};

int main()
{
    std::cout << REVISION << std::endl;  

    std::time_t t0 = time(NULL);

    std::cout << "Open: 'BigFile.bin'" << std::endl;
    FILE * file;
    file = fopen("BigFile.bin", "wb");
    if (file != NULL)
    {
        std::cout << "Opened. Writing..." << std::endl;
        for (int n=0; n<4096; n++)
        {
            size_t written = fwrite(block, 1, block_size, file);
            if (written != block_size)
            {
                std::cout << "Write error." << std::endl;
                return 1;
            }
        }
        fclose(file);
        std::cout << "Success." << std::endl;

        time_t t1 = time(NULL);
        if (t0 == ((time_t)-1) || t1 == ((time_t)-1))
        {
            std::cout << "Clock error." << std::endl;
            return 2;
        }

        double ticks = (double)(t1 - t0);
        std::cout << "Seconds: " << ticks << std::endl;

        file = fopen("BigFile.log", "w");
        fprintf(file, REVISION);
        fprintf(file, "   Seconds: %f\n", ticks);
        fclose(file);

        return 0;
    }

    std::cout << "Something went wrong." << std::endl;
    return 1;
}

It simply writes 4GB of zeros to a file on disk and times how long it took.

Under Linux, this takes 148 seconds on average. Under Windows, on the same PC, it takes 247 seconds on average.

What the hell am I doing wrong?!

The code is compiled under GCC for Linux, and Visual Studio for Windows, but I cannot imagine a universe in which the compiler used should make any measurable difference to a pure I/O benchmark. The filesystem used in all cases is NTFS.

I just don't understand why such a vast performance difference exists. I don't know why Windows is running so slow. How do I force Windows to run at the full speed that the disk is clearly capable of?

(The numbers above are for OpenSUSE 13.1 32-bit and Windows XP 32-bit on an old Dell laptop. But I've observed similar speed differences on several PCs around the office, running various versions of Windows.)

Edit: The executable and the file it writes both reside on an external USB harddisk which is formatted as NTFS and is nearly completely empty. Fragmentation is almost certainly not a problem. It could be some kind of driver issue, but I've seen the same performance difference on several other systems running different versions of Windows. There is no antivirus installed.

Just for giggles, I tried changing it to use the Win32 API directly. (Obviously this only works for Windows.) Time becomes a little more erratic, but still within a few percent of what it was before. Unless I specify FILE_FLAG_WRITE_THROUGH; then it goes significantly slower. A few other flags make it slower, but I can't find the one that makes it go faster...
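Roughly, the Win32 variant replaces fopen/fwrite with something like the following sketch (this is not the exact code, just the shape of it, with the flag in question marked and error handling trimmed):

#include <windows.h>

HANDLE h = CreateFileA("BigFile.bin",
                       GENERIC_WRITE,
                       0,                       // no sharing
                       NULL,
                       CREATE_ALWAYS,
                       FILE_ATTRIBUTE_NORMAL,   // adding FILE_FLAG_WRITE_THROUGH here is what slows it down
                       NULL);
if (h != INVALID_HANDLE_VALUE)
{
    for (int n = 0; n < 4096; n++)
    {
        DWORD written = 0;
        if (!WriteFile(h, block, block_size, &written, NULL) || written != block_size)
            return 1;   // write error
    }
    CloseHandle(h);
}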

Upvotes: 7

Views: 1230

Answers (3)

davmac

Reputation: 20631

You need to sync file contents to disk, otherwise you are just measuring the level of caching being performed by the operating system.

Call fsync before you close the file.

If you don't do this, most of the execution time is likely spent waiting for the cache to be flushed so that new data can be stored in it, but a portion of the data you write will certainly not have been written out to disk by the time you close the file. The difference in execution times, then, is probably due to Linux caching more of the writes before it runs out of available cache space. By contrast, if you call fsync before closing the file, all the written data should be flushed to disk before your time measurement takes place.

I suspect if you add an fsync call, the execution time on the two systems won't differ by so much.
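For example, a minimal sketch of the change (assuming the Linux build uses POSIX fsync and the Windows build uses the MSVC CRT's _commit, its closest equivalent):

#include <cstdio>
#if defined(_WIN32)
#include <io.h>        // _commit, _fileno
#else
#include <unistd.h>    // fsync
#endif

// Flush stdio's buffer to the OS, then ask the OS to push its cache to the
// device, so the timed region includes the physical write.
void flush_to_disk(FILE * file)
{
    std::fflush(file);
#if defined(_WIN32)
    _commit(_fileno(file));
#else
    fsync(fileno(file));
#endif
}

In the benchmark you would call flush_to_disk(file); immediately before fclose(file).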

Upvotes: 3

Puppy

Reputation: 146910

There are special optimizations for pages which are all zeros. You should fill the page with random data before writing it out.
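For example, a minimal sketch of that change (the generator and fixed seed are arbitrary choices, not part of the original program):

#include <random>

const int block_size = 1024 * 1024;
char block[block_size];          // no longer const, so it can be filled at runtime

void fill_block_with_random_data()
{
    std::mt19937 gen(12345);     // fixed seed so runs stay comparable
    std::uniform_int_distribution<int> byte(0, 255);
    for (int i = 0; i < block_size; ++i)
        block[i] = static_cast<char>(byte(gen));
}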

Upvotes: 0

skyking

Reputation: 14390

Your test is not a very good way to measure performance, as there are places where different optimizations in different OSes and libraries can make a huge difference (the compiler itself doesn't have to make a big difference).

First, consider that fwrite (or anything that operates on a FILE*) is a library layer above the OS layer. Different buffering strategies there can make a difference. For example, one smart way of implementing fwrite would be to flush the buffers and then send the data block straight to the OS instead of going through the buffer layer. This can result in a huge advantage at the next step.
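As an illustration, one way to take this library-level buffering out of the comparison is to set the stream's buffering mode explicitly right after fopen. This is only a sketch; whether it changes the numbers depends on each C library's fwrite implementation:

#include <cstdio>

FILE * file = std::fopen("BigFile.bin", "wb");
if (file != NULL)
{
    // _IONBF: unbuffered, so every fwrite is handed straight to the OS and
    // differences in the C libraries' internal buffering strategies drop out.
    std::setvbuf(file, NULL, _IONBF, 0);

    // ... same write loop as in the question ...
}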

Second, we have the OS/kernel, which can handle the write differently. One smart optimization would be to copy pages by just aliasing them and then using copy-on-write if one of the aliases is changed. Linux already does (almost) this when allocating memory to the process (including the BSS section where the array is): it just marks the page as being all zeros, keeps a single such page for all those pages, and creates a new page whenever somebody writes to a zero page. Doing this trick again means that the kernel could simply alias such a page in the disk buffer. This means that the kernel would not run low on disk cache when writing such blocks of zeroes, since they only take up 4 KiB of actual memory (except for page tables). This strategy is also possible if there's actual data in the data block.

This means that the writes could complete very quickly without any data actually needing to be transferred to the disk (before fwrite completes), and even without the data having to be copied from one place to another in memory.

So you are using different libraries and different OSes, and it's not surprising that they perform different tasks in different amounts of time.

Upvotes: 0
