First User

Reputation: 761

Confused about buffered and unbuffered stdout/stderr in C and Python

I am confused about a couple of things when it comes to the issue of stdout and stderr being buffered/unbuffered:

1)

Is the statement "stdout/err is buffered/unbuffered" decided by my operating system or by the programming-language library functions (in particular, the write() or print() functions) that I am working with?

While programming in C, I have always gone by the rule that stdout is buffered while stderr is unbuffered. I have seen this in action by calling sleep() after putchar() statements within a while loop: the individual characters appeared on stderr one by one, while only complete lines appeared on stdout. When I tried to replicate this program in Python, stderr and stdout showed the same behaviour: both produced complete lines. So I looked this up and found a post that said:

sys.stderr is line-buffered by default since Python 3.9.

Hence the question - because I was under the impression that the behaviour of stderr being buffered/unbuffered was decided and fixed by the OS, but apparently code libraries are free to implement their own behaviour? Can I hypothetically write a routine that writes to stdout without a buffer?

The relevant code snippets for reference:

/* C */
int c;  /* int, not char, so EOF can be distinguished from valid bytes */
while ((c = fgetc(file)) != EOF) {
    fputc(c, stdout /* or stderr */);
    usleep(800);
}

# Python
for line in file:
    for ch in line:
        print(ch, end='', file=sys.stdout)  # or sys.stderr
        time.sleep(0.08)

2)

Secondly, my understanding of the need for buffering is that, since disk access is slower than RAM access, writing individual bytes would be inefficient, and thus bytes are written in blocks. But is writing to a device file like /dev/stdout or /dev/stdin the same as writing to disk? (Isn't disk supposed to be permanent? Stuff written to stdout or stderr only appears in the terminal, if connected, and is then lost, right?)

3)

Finally, is there really a need for stderr to be unbuffered in C if it is less efficient?

Upvotes: 3

Views: 1911

Answers (2)

John Bollinger

Reputation: 180113

Is the statement "stdout/err is buffered/unbuffered" decided by my operating system or by the programming-language library functions (in particular, the write() or print() functions) that I am working with?

Mostly it is decided by the programming language implementation, and programming languages standardize this. For example, the C language specification says:

At program startup, three text streams are predefined and need not be opened explicitly — standard input (for reading conventional input), standard output (for writing conventional output), and standard error (for writing diagnostic output). As initially opened, the standard error stream is not fully buffered; the standard input and standard output streams are fully buffered if and only if the stream can be determined not to refer to an interactive device.

(C2017, paragraph 7.21.3/7)

Similarly, the Python docs for sys.stdin, sys.stdout, and sys.stderr say:

When interactive, the stdout stream is line-buffered. Otherwise, it is block-buffered like regular text files. The stderr stream is line-buffered in both cases. You can make both streams unbuffered by passing the -u command-line option or setting the PYTHONUNBUFFERED environment variable.

Be aware, however, that both of those particular languages provide mechanisms to change the buffering of the standard streams (or in the Python case, at least stdout and stderr).
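One such mechanism can be sketched in Python with `io.TextIOWrapper.reconfigure()` (available since Python 3.7); the `io.BytesIO` standing in for a real device is the only artificial part:

```python
import io

# Sketch: changing a text stream's buffering behaviour at runtime.
# A BytesIO stands in for the underlying device so the effect is visible.
raw = io.BytesIO()
stream = io.TextIOWrapper(raw, encoding="utf-8", write_through=False)

stream.write("hello")       # sits in the wrapper's internal buffer
before = raw.getvalue()     # the "device" has received nothing yet

stream.flush()              # push the buffered text down to the BytesIO
stream.reconfigure(write_through=True)   # disable write buffering
stream.write("world")       # now passed straight through on each write
after = raw.getvalue()
```

The same idea applies to `sys.stdout` and `sys.stderr`, which are `TextIOWrapper` objects in CPython; in C, the analogous knob is `setvbuf()`.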

Moreover, the above is relevant only if you are using streams (C) or file objects (Python). In C, this is what all of the stdio functions use -- printf(), fgets(), fwrite(), etc. -- but it is not what (say) the POSIX raw I/O functions such as read() and write() use. If you use raw I/O interfaces such as the latter, then there is only whatever buffering you perform manually.
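The raw, unbuffered level is also reachable from Python via `os.read()`/`os.write()`, which operate directly on file descriptors, much like the POSIX calls. A minimal sketch using a pipe:

```python
import os

# Sketch: raw file-descriptor I/O, the Python analogue of POSIX
# read()/write(). Each os.write() is a single syscall; no library-level
# buffer sits in between.
r, w = os.pipe()
os.write(w, b"one syscall, no buffering\n")
os.close(w)
data = os.read(r, 1024)
os.close(r)
```

Passing `sys.stdout.fileno()` (usually 1) to `os.write()` would bypass the stream's buffer in the same way, with the caveats about mixing raw and buffered I/O noted below.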

Hence the question - because I was under the impression that the behaviour of stderr being buffered/unbuffered was decided and fixed by the OS

No. The OS (at least Unixes (including Mac) and Windows) does not perform I/O buffering on behalf of programs. Programming language implementations do, under some circumstances, and they are then in control of the details.

but apparently code libraries are free to implement their own behaviour?

It's a bit more nuanced than that, but basically yes.

Can I hypothetically write a routine that writes to stdout without a buffer?

Maybe. In C or Python, at least, you can exert some control over the buffering mode of the stdout stream. In C you can adjust it dynamically at runtime, but in Python I think the buffering mode is decided when Python starts.

You may also be able to bypass the buffer of a buffered stream by performing (raw) I/O on the underlying file descriptor, but this is extremely poor form, and depending on the details, it may produce undefined behavior.
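As a concrete illustration in Python, `open(..., buffering=0)` gives a fully unbuffered binary stream: every `.write()` goes straight to the OS, so a second reader sees the data immediately, without the writer flushing or closing first. (A temporary file stands in here for whatever device stdout might be attached to.)

```python
import os
import tempfile

# Sketch: an unbuffered binary stream via open(..., buffering=0).
fd, path = tempfile.mkstemp()
os.close(fd)

writer = open(path, "wb", buffering=0)
writer.write(b"visible immediately")   # no flush needed

with open(path, "rb") as reader:       # opened while the writer is still open
    seen = reader.read()

writer.close()
os.remove(path)
```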

Secondly, my understanding of the need for buffering is that: since disk access is slower than RAM access, writing individual bytes would be inefficient and thus bytes are written in blocks.

All I/O is slow, even I/O to a terminal. Disk I/O tends to be especially slow, but program performance generally benefits from buffering I/O to all devices.

But is writing to a device file like /dev/stdout and /dev/stdin the same as writing to disk?

Sometimes it is exactly writing to disk (look up I/O redirection). Different devices do have different performance characteristics, so buffering may improve performance more with some than with others, but again, all I/O is slow.

Finally, is there really a need for stderr to be unbuffered in C if it is less efficient?

The point of stderr being unbuffered (by default) in C is so that messages directed there are written to the underlying device (often a terminal) as soon as possible. Efficiency is not really a concern for the kinds of messages that this policy is most intended to serve.
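The practical consequence is easy to observe from Python: when a child process's stdout is a pipe, it is block-buffered and only flushed at exit, while stderr is line-buffered, so a diagnostic written *after* the normal output can still arrive *first*. A sketch that merges both streams into one pipe to capture arrival order:

```python
import subprocess
import sys

# Child prints to stdout, then to stderr. With stdout piped (so
# block-buffered) the stdout line waits in the buffer until exit,
# while the stderr line is flushed immediately.
code = 'import sys; print("out"); print("err", file=sys.stderr)'
result = subprocess.run(
    [sys.executable, "-c", code],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,   # interleave both streams in arrival order
)
lines = result.stdout.splitlines()   # stderr's line comes first
```

This inversion of message order is exactly what unbuffered (or line-buffered) stderr is meant to limit: you want the error visible as soon as possible, even if the program later crashes without flushing stdout.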

Upvotes: 3

Kaia

Reputation: 891

https://linux.die.net/man/3/stderr, https://linux.die.net/man/3/setbuf, and https://linux.die.net/man/2/write are helpful resources here

  • If you use the raw syscall write, there won't be buffering. I'd imagine the same is true for WinAPI but I don't know.
  • Python and C want to make it easier to write things, so they wrap the raw syscalls in a file pointer (in C) / file object (in Python). This wrapper, in addition to storing the raw file descriptor used to make the syscalls, can optionally do things like buffering to reduce the number of syscalls you're making.
  • You can change the buffering settings of a file or stream. (In C that's setbuf; I'm not sure for Python.)
  • C and Python just happen to have different default configurations of stderr's wrapper.

For 2), writing to a pipe is usually much faster than writing to disk, but it's still a relatively slow operation compared to memcpy or the like, which is what buffering essentially is. The processor has to jump into kernel mode and back.
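The kernel-round-trip cost is measurable from Python: compare many small writes through a default (buffered) stream against an unbuffered one (`buffering=0`), where each `.write()` is its own syscall. A rough sketch, not a rigorous benchmark:

```python
import os
import tempfile
import time

def timed_writes(path, buffered):
    """Write 20,000 small payloads; return elapsed seconds."""
    payload = b"x" * 16
    f = open(path, "wb") if buffered else open(path, "wb", buffering=0)
    t0 = time.perf_counter()
    for _ in range(20_000):
        f.write(payload)
    f.close()
    return time.perf_counter() - t0

fd, path = tempfile.mkstemp()
os.close(fd)
t_buffered = timed_writes(path, True)
t_unbuffered = timed_writes(path, False)   # typically several times slower
size = os.path.getsize(path)
os.remove(path)
```

The buffered variant batches the payloads into a handful of large writes; the unbuffered one pays the mode switch 20,000 times.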

For 3), I'd guess that C developers decided it was more important to get errors on-time than to get performance. In general, if your program is spitting out lots of data to stderr you have bigger problems than performance.

Upvotes: 2
