Tryer

Reputation: 4050

C++: getting a console app on Windows to print as fast as on Linux

This code:

#include <cstdio>    // getchar
#include <ctime>     // time, difftime
#include <iostream>

int main() {
    time_t b4 = time(NULL);   // start time, 1-second resolution
    for (int i = 0; i < 50000; i++)
        std::cout << i << " ";
    std::cout << std::endl;
    time_t a4 = time(NULL);   // end time
    std::cout << "Time taken is " << difftime(a4, b4);
    getchar();                // keep the console window open
}

when compiled, built, and run on Windows with Visual Studio using these commands:

CL.exe /c /Zi /nologo /W3 /WX- /diagnostics:column /sdl /O2 /Oi /GL /D _MBCS /Gm- /EHsc /MD /GS /Gy /fp:precise /permissive- /Zc:wchar_t /Zc:forScope /Zc:inline /FA /Fa"x64\Release\\" /Fo"x64\Release\\" /Fd"x64\Release\vc142.pdb" /Gd /TP /FC /errorReport:prompt ..\src\console_printf.cpp
         console_printf.cpp
       Link:
link.exe /ERRORREPORT:PROMPT /OUT:"Release\windows.exe" /NOLOGO kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /MANIFEST /MANIFESTUAC:"level='asInvoker' uiAccess='false'" /manifest:embed /DEBUG:FULL /PDB:"Release\windows.pdb" /SUBSYSTEM:CONSOLE /OPT:REF /OPT:ICF /LTCG:incremental /TLBID:1 /DYNAMICBASE /NXCOMPAT /IMPLIB:"Release\windows.lib" /MACHINE:X64 x64\Release\console_printf.obj

finally prints (after printing ... 49998 49999)

Time taken is 15

The same code when compiled/built/run on Linux with:

g++    -c -O2 -MMD -MP -MF "build/Release/GNU-Linux/_ext/511e4115/console_printf.o.d" -o build/Release/GNU-Linux/_ext/511e4115/console_printf.o ../src/console_printf.cpp
mkdir -p dist/Release/GNU-Linux

finally prints (after printing ... 49998 49999)

Time taken is 1

That is, console/terminal printing on Linux is much faster. Both tests used optimized release builds. The tests ran on two separate machines (one running Windows/Visual Studio, the other running Linux), but their computing power is comparable.

Is there a way to get Windows console printing as fast as Linux? I run numerically intensive, iterative code that periodically displays progress on the console. I am now worried that Windows console printing is distorting the recorded time: the slowdown would not be the algorithm's fault, but the result of console output unwittingly becoming the bottleneck.

Upvotes: 1

Views: 492

Answers (2)

Spencer

Reputation: 2214

Your standard library implementation may be part of your problem. I ran the following code with plain vanilla Visual C++:

#define WRITE_CONSOLE_API
#define _CRT_SECURE_NO_WARNINGS
#include <iostream>
#include <memory>     // std::uninitialized_fill_n
#include <stdlib.h>   // _itoa
#include <string.h>   // strlen
#include <windows.h>

int main() {
    LARGE_INTEGER freq;
    QueryPerformanceFrequency(&freq);
    LARGE_INTEGER start;
    LARGE_INTEGER stop;
    std::ios_base::sync_with_stdio(true);
#ifdef WRITE_CONSOLE_API
    char buf[20];
    static char buf2[2] = { '\r', '\0' };
    std::uninitialized_fill_n(buf, 20, '\0');
    auto con = GetStdHandle(STD_OUTPUT_HANDLE);
    DWORD l;
    DWORD lr;
#endif
    QueryPerformanceCounter(&start);
#ifdef WRITE_CONSOLE_API
    for (int i = 0; i < 50000; i++)
    {
       if (i)
          WriteConsoleA(con, buf2, 1, &lr, NULL);
       _itoa(i, buf, 10);
       l = strlen(buf);
       WriteConsoleA(con, buf, l, &lr, NULL);
    }
    buf2[0] = '\n';
    WriteConsoleA(con, buf2, 1, &lr, NULL);
#else
    for (int i = 0; i < 50000; i++)
        std::cout << '\r' << i;
    std::cout << std::endl;
#endif
    QueryPerformanceCounter(&stop);
    double diff = stop.QuadPart - start.QuadPart;
    std::cout << "Time taken is " << diff / freq.QuadPart << " secs\n";
    std::cin.ignore(1);
}

with WRITE_CONSOLE_API defined (where it used the Windows API call WriteConsole) and also with it not defined (where it used std::cout).

With WRITE_CONSOLE_API defined, the result was

Time taken is 2.12448 secs

With WRITE_CONSOLE_API not defined, the result was

Time taken is 6.25676 secs

If you use a space instead of \r (i.e. to force the console window to scroll), you get

Time taken is 3.02435 secs

with WRITE_CONSOLE_API defined, and

Time taken is 7.27557 secs

with WRITE_CONSOLE_API not defined. Scrolling appears to add about 1 second to both times.

You should try this on your own machine, because the timings may vary.

These timings are from a debug build, so with no optimization. With optimization, the standard library version dropped to 6.8 seconds (scrolling) and 5.6 seconds (non-scrolling), but the Windows API version didn't change.
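To repeat these measurements on both Windows and Linux, a portable timer may be easier than QueryPerformanceCounter, which is Windows-only. The following is a minimal sketch, assuming std::chrono::steady_clock gives enough resolution for this purpose. The names time_seconds and time_print_loop are hypothetical helpers made up for illustration, not part of any library.

```cpp
#include <chrono>
#include <iostream>

// Hypothetical helper: runs `work` once and returns the elapsed
// wall-clock time in seconds, measured with a monotonic clock.
template <typename Work>
double time_seconds(Work work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// The print loop from the question, timed with sub-second resolution.
double time_print_loop(int count) {
    return time_seconds([count] {
        for (int i = 0; i < count; i++)
            std::cout << i << ' ';
        std::cout << std::endl;
    });
}
```

Calling time_print_loop(50000) and printing the result should give a figure comparable to the QueryPerformanceCounter numbers above, without the one-second granularity of time().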

If you truly want to separate the program's actual work from the vagaries of the operating system, you could create a thread to do the work and use the other thread to write progress to the console. You really only need to connect them through the progress count itself, stored as a std::atomic<some_int_type>.
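The two-thread idea could look roughly like the sketch below. It is an illustration under stated assumptions, not a definitive implementation: do_work is a hypothetical stand-in for the real numerical code, run_with_progress is a made-up name, and the 100 ms polling interval is an arbitrary choice.

```cpp
#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>

// Hypothetical stand-in for the real numerical work: sums 0..total-1 and
// publishes the number of finished iterations through `progress`.
long long do_work(int total, std::atomic<int>& progress) {
    long long sum = 0;
    for (int i = 0; i < total; i++) {
        sum += i;
        progress.store(i + 1, std::memory_order_relaxed);
    }
    return sum;
}

// Runs do_work on a worker thread while the calling thread prints progress.
long long run_with_progress(int total) {
    std::atomic<int> progress{0};
    long long result = 0;
    std::thread worker([&] { result = do_work(total, progress); });

    // Poll at a fixed interval; the worker never waits on the console.
    while (progress.load(std::memory_order_relaxed) < total) {
        std::cout << '\r' << progress.load(std::memory_order_relaxed)
                  << std::flush;
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
    }
    worker.join();  // after join, the worker's write to `result` is visible
    std::cout << '\r' << total << '\n';
    return result;
}
```

With this layout the worker only performs an atomic store per iteration, so timing taken inside do_work is largely independent of how slowly the console renders.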

Upvotes: 3

Thomas Matthews

Reputation: 57698

If you want to improve console I/O (which is the bottleneck for most I/O-bound applications), print to a buffer, then write the buffer to the console in a single block.

#include <iostream>
#include <string>

int main ()
{
    std::string buffer;
    // The numbers 0..49999 plus separators take about 289,000 characters.
    buffer.reserve(300000);
    for (unsigned int i = 0; i < 50000; ++i)
    {
        // Format each number directly into the buffer.
        buffer += std::to_string(i);
        buffer += ' ';
    }
    buffer += '\n';
    // Write the whole buffer to the console in one block.
    std::cout.write(buffer.data(), static_cast<std::streamsize>(buffer.size()));

    return 0;
}

The above code uses a std::string for its buffer. All the numbers are formatted (human readable) into the buffer. The buffer is then written to the console using a block write.

The idea behind buffer.reserve() is to allocate a large enough buffer to reduce the reallocations.
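A related variation, not taken from the answer above: instead of building the string by hand, the C runtime can be asked to fully buffer stdout with std::setvbuf. Because std::cout is synchronized with stdout by default, its output should then reach the console in large blocks. This is a sketch under the assumption that the call is made before anything has been written to stdout. The helper names and the buffer size passed in are arbitrary choices for illustration, and whether it helps on a given console should be measured.

```cpp
#include <cstdio>
#include <iostream>

// Hypothetical helper: switches stdout to full buffering with a buffer of
// `bytes` bytes allocated by the runtime. Call before the first write.
// Returns true on success (std::setvbuf returns 0).
bool use_full_buffering(std::size_t bytes) {
    return std::setvbuf(stdout, nullptr, _IOFBF, bytes) == 0;
}

// Prints 0..count-1; data leaves the buffer when it fills and on flush.
void print_numbers(int count) {
    for (int i = 0; i < count; i++)
        std::cout << i << ' ';
    std::cout << '\n';
    std::cout.flush();  // make sure the tail of the output appears
}
```

For periodic progress lines in a long-running program, an explicit std::cout.flush() after each progress message keeps the display current, since full buffering otherwise delays output until the buffer fills.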

Upvotes: 2
