Reputation: 15849

C/C++ best way to send a number of bytes to stdout

Profiling my program shows that the function print takes a lot of time. How can I send "raw" byte output directly to stdout instead of using fwrite, and make it faster? (I need to send all 9 bytes in print() to stdout at the same time.)

void print(){
    unsigned char temp[9];

    temp[0] = matrix[0][0];
    temp[1] = matrix[0][1];
    temp[2] = matrix[0][2];
    temp[3] = matrix[1][0];
    temp[4] = matrix[1][1];
    temp[5] = matrix[1][2];
    temp[6] = matrix[2][0];
    temp[7] = matrix[2][1];
    temp[8] = matrix[2][2];

    fwrite(temp,1,9,stdout);

}

matrix is defined globally as unsigned char matrix[3][3];

Upvotes: 3

Answers (11)

Daniel Bişar

Reputation: 2763

I am pretty sure you can improve the output performance by increasing the buffer size, so that you need fewer fwrite calls. write might be faster, but I am not sure. Just try this:

$ yes | dd of=/dev/null count=1000000
1000000+0 records in
1000000+0 records out
512000000 bytes (512 MB, 488 MiB) copied, 2.18338 s, 234 MB/s

$ yes | dd of=/dev/null count=100000 bs=50KB iflag=fullblock
100000+0 records in
100000+0 records out
5000000000 bytes (5.0 GB, 4.7 GiB) copied, 2.63986 s, 1.9 GB/s

The same applies to your code. Some tests over the last few days suggest that good buffer sizes are probably somewhere between 1 << 12 (= 4096) and 1 << 16 (= 65536) bytes.
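For a stdio-based program such as the one in the question, one way to try a larger buffer is setvbuf. The following is a minimal sketch, assuming a 64 KiB buffer is worth experimenting with; the helper name use_big_buffer and the constant OUT_BUF_SIZE are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

enum { OUT_BUF_SIZE = 1 << 16 };     /* 65536 bytes; an assumed size to experiment with */
static char out_buf[OUT_BUF_SIZE];   /* static, so it outlives every use of the stream */

/* Hypothetical helper: switch `stream` to full buffering using out_buf.
 * Call it before the first read or write on the stream.
 * Returns 0 on success, nonzero on failure (same convention as setvbuf). */
int use_big_buffer(FILE *stream)
{
    assert(stream != NULL);
    return setvbuf(stream, out_buf, _IOFBF, sizeof out_buf);
}
```

Calling use_big_buffer(stdout) at the top of main should mean that each 9-byte fwrite only copies into out_buf, and the data is handed to the OS roughly once every 7,000 records instead of whenever the default buffer fills.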

Upvotes: 0

Anton Stafeyev

Reputation: 831

First, don't print on every iteration. In other words, don't do this:

for(int i = 0; i<100; i++){
    printf("Your stuff");
}

Instead, allocate a buffer on the stack or on the heap, store your information there, and then send the whole buffer to stdout in one go, like this:

char *buffer = malloc(100);       // 100 bytes, not sizeof(100)
for (int i = 0; i < 100; i++) {
    buffer[i] = 1;                // your byte value goes here
}

// once you are done, print it to the console with
write(1, buffer, 100);
free(buffer);

But in your case, just use write(1, temp, 9);
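Applied to the 9-byte records from the question, the batching idea could look roughly like the sketch below. It is POSIX-only and makes several assumptions: the names struct batch, batch_add and batch_flush are hypothetical, and the batch size of 1024 records is an arbitrary choice.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>   /* write(), ssize_t; POSIX only */

enum { RECORD_SIZE = 9, BATCH_RECORDS = 1024 };   /* assumed sizes */

struct batch {
    unsigned char data[RECORD_SIZE * BATCH_RECORDS];
    size_t used;   /* number of bytes currently stored in data */
};

/* Write everything in the batch to fd, retrying after short writes.
 * Returns 0 on success, -1 on error (errno is set by write). */
int batch_flush(struct batch *b, int fd)
{
    size_t off = 0;
    while (off < b->used) {
        ssize_t n = write(fd, b->data + off, b->used - off);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: try again */
            return -1;
        }
        off += (size_t)n;
    }
    b->used = 0;
    return 0;
}

/* Append one 9-byte record; if the batch is full, flush it to fd first.
 * Returns 0 on success, -1 if the flush failed. */
int batch_add(struct batch *b, int fd, const unsigned char *record)
{
    assert(b->used <= sizeof b->data);
    if (b->used + RECORD_SIZE > sizeof b->data && batch_flush(b, fd) != 0)
        return -1;
    memcpy(b->data + b->used, record, RECORD_SIZE);
    b->used += RECORD_SIZE;
    return 0;
}
```

With this, print() would call batch_add(&b, 1, &matrix[0][0]) and the program would call batch_flush(&b, 1) once before exiting, so the write system call runs once per 1024 records rather than once per record.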

Upvotes: 0

Allan Stokes

Reputation: 535

The top rated answer claims that IO is slow.

Here's a quick benchmark with a sufficiently large buffer to take the OS out of the critical performance path, but only if you're willing to receive your output in giant blurps. If latency to first byte is your problem, you need to run in "dribs" mode.

Write 10 million records from a nine byte array

Mint 12 AMD64 on 3GHz CoreDuo under gcc 4.6.1

   340ms   to /dev/null 
   710ms   to 90MB output file 
 15254ms   to 90MB output file in "dribs" mode

FreeBSD 9 AMD64 on 2.4GHz CoreDuo under clang 3.0

   450ms   to /dev/null 
   550ms   to 90MB output file on ZFS triple mirror
  1150ms   to 90MB output file on FFS system drive
 22154ms   to 90MB output file in "dribs" mode

There's nothing slow about IO if you can afford to buffer properly.

#include <stdio.h> 
#include <assert.h> 
#include <stdlib.h>
#include <string.h>

int main (int argc, char* argv[]) 
{
    int dribs = argc > 1 && 0==strcmp (argv[1], "dribs");
    int err;
    int i; 
    enum { BigBuf = 4*1024*1024 };
    char* outbuf = malloc (BigBuf); 
    assert (outbuf != NULL); 
    err = setvbuf (stdout, outbuf, _IOFBF, BigBuf); // full buffering
    assert (err == 0);

    enum { ArraySize = 9 };
    char temp[ArraySize]; 
    enum { Count = 10*1000*1000 }; 

    for (i = 0; i < Count; ++i) {
        fwrite (temp, 1, ArraySize, stdout);    
        if (dribs) fflush (stdout); 
    }
    fflush (stdout);  // seems to be needed after setting own buffer
    fclose (stdout);
    if (outbuf) { free (outbuf); outbuf = NULL; }
}

Upvotes: 10

Ketan

Reputation: 1015

As everyone has pointed out, IO in a tight inner loop is expensive. I have normally ended up printing the matrix conditionally (with cout), based on some criteria, when I needed to debug it.

If your app is a console app, try redirecting its output to a file; that will be a lot faster than refreshing the console, e.g. app.exe > matrixDump.txt

Upvotes: 1

Daniel

Reputation: 374

Try running the program twice, once with output and once without. You will notice that, overall, the run without the IO is faster. You could also fork the process (or create a thread), with one writing to the file (stdout) and the other doing the computation.

Upvotes: 0

Darron

Reputation: 21628

Perhaps your problem is not that fwrite() is slow, but that it is buffered. Try calling fflush(stdout) after the fwrite().

This all really depends on your definition of slow in this context.

Upvotes: 3

falstro

Reputation: 35667

The rawest form of output you can do is probably the write system call, like this:

write (1, matrix, 9);

1 is the file descriptor for standard out (0 is standard in, and 2 is standard error). Your standard out will only write as fast as whatever is reading it at the other end (i.e. the terminal, or the program you're piping into), which might be rather slow.

I'm not 100% sure, but you could try setting non-blocking IO on fd 1 (using fcntl) and hope the OS will buffer it for you until it can be consumed by the other end. It's been a while, but I think it works like this

fcntl (1, F_SETFL, O_NONBLOCK);

YMMV though. Please correct me if I'm wrong on the syntax, as I said, it's been a while.
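On the syntax: the call above is valid, but it replaces all of the descriptor's status flags. A slightly more careful variant reads the current flags first and adds O_NONBLOCK to them. This is a minimal POSIX sketch; the helper name set_nonblocking is made up for illustration.

```c
#include <assert.h>
#include <fcntl.h>   /* fcntl(), F_GETFL, F_SETFL, O_NONBLOCK; POSIX only */

/* Hypothetical helper: add O_NONBLOCK to fd's status flags, keeping the others.
 * Returns 0 on success, -1 on error (errno is set by fcntl). */
int set_nonblocking(int fd)
{
    assert(fd >= 0);
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}
```

After set_nonblocking(1), a write to a pipe or terminal that cannot accept more data returns -1 with errno set to EAGAIN instead of waiting, so the caller still has to hold on to the unwritten bytes and retry later.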

Upvotes: 3

vdsf

Reputation: 1618

You can simply:

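std::cout.write(reinterpret_cast<const char*>(temp), sizeof temp);  // temp is not NUL-terminated, so plain << would read past its end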

printf is more C-Style.

Yet, IO operations are costly, so use them wisely.

Upvotes: -1

anon

Reputation:

What's wrong with:

fwrite(matrix,1,9,stdout);

Both the one- and the two-dimensional arrays have the same layout in memory.
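As a minimal sketch of that point, the copy into temp can be dropped and the matrix written directly. The function name print_matrix and its FILE* parameter are illustrative choices, not part of the question's code.

```c
#include <assert.h>
#include <stdio.h>

unsigned char matrix[3][3];   /* same declaration as in the question */

/* Write all nine bytes of the matrix in row-major order.
 * Returns the number of bytes written (9 on success). */
size_t print_matrix(FILE *out)
{
    assert(out != NULL);
    /* A 3x3 array of unsigned char is 9 contiguous bytes, so sizeof matrix is 9. */
    return fwrite(matrix, 1, sizeof matrix, out);
}
```

Calling print_matrix(stdout) emits the bytes in the same order as the temp[0] through temp[8] assignments in the question.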

Upvotes: 0

FreeMemory

Reputation: 8614

IO is not an inexpensive operation. It is, in fact, a blocking operation, meaning that the OS can preempt your process when you call write to allow more CPU-bound processes to run, before the IO device you're writing to completes the operation.

The only lower-level function you can use (if you're developing on a *nix machine) is the raw write function, but even then your performance will not be much better than it is now. Simply put: IO is expensive.

Upvotes: 10

jasedit

Reputation: 662

All printing is fairly slow, although iostreams are really slow for printing.

Your best bet would be to use printf, something along the lines of:

printf("%c%c%c%c%c%c%c%c%c\n", matrix[0][0], matrix[0][1], matrix[0][2], matrix[1][0],
  matrix[1][1], matrix[1][2], matrix[2][0], matrix[2][1], matrix[2][2]);

Upvotes: 1
