Why does read() fail with EAGAIN when piping to a program using boost::asio for STDIN/STDOUT?

Question

I have a small program that makes an SSL connection to a server and then copies data from STDIN to the server and data from the server to STDOUT (much like openssl s_client). I'm using boost::asio for reading and writing to STDIN, STDOUT and the SSL socket. The problem is that I cannot pipe data from another program to my program, e.g.:

cat | myprog

I type a line and hit enter, and it all works as it should: the line of text finds its way through my program, to the server which responds, and the response is printed to my console. The next time I send a command, cat sends it but fails on the next read() call (I type the lines beginning with "echo"):

echo foo
foo
echo bar
cat: -bar
: Resource temporarily unavailable

Why is this happening?

strace confirms this, from cat:

read(0, "echo foo\n", 32768)            = 9
write(1, "echo foo\n", 9)               = 9
read(0, "echo bar\n", 32768)            = 9
write(1, "echo bar\n", 9)               = 9
read(0, 0xa02c000, 32768)               = -1 EAGAIN (Resource temporarily unavailable)

Theory #1: boost::asio is setting STDIN to non-blocking for its own purposes, but this is also affecting STDIN for cat. This shouldn't be a problem if I change my code to fork() off the preprocessor, allowing it to inherit STDIN and STDERR, and capture its STDOUT, which asio can read from directly. That way asio wouldn't have to touch STDIN. This has been done, and strace confirms that file descriptor 0 has been left alone.

Theory #2: When my program writes to STDOUT, it does something that changes cat's STDIN from blocking to non-blocking. I don't think this is the case:

14211 read(0,  
//myprog (pid 14209) does epoll stuff here
//cat (pid 14211) receives my command
14211 <... read resumed> "echo foo\n", 32768) = 9
//more epoll
//cat writes
14211 write(1, "echo foo\n", 9)         = 9
14209 <... epoll_wait resumed> {{EPOLLIN, {u32=136519504, u64=136519504}}}, 128, -1) = 1
//cat starts reading again
14211 read(0,  
//myprog receives command from cat
14209 readv(3, [{"echo foo\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 512}], 1) = 9
//sends it to the server (encrypted)
14209 sendmsg(7, {msg_name(0)=NULL, msg_iov(1)=[{"\27\3\3\0'T\252\251\317w\255\310}h\322\222%\204\326FA\271\302\241\376\237\7\377\275\250o\262"..., 44}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 44
14209 epoll_wait(5, {{EPOLLIN|EPOLLOUT, {u32=136514856, u64=136514856}}}, 128, 0) = 1
14209 readv(3, 0xbfd8be04, 1)           = -1 EAGAIN (Resource temporarily unavailable)
//receives response
14209 recvmsg(7, {msg_name(0)=NULL, msg_iov(1)=[{"\27\3\3\0"\357\351\3276a\233\356C\326z\317\252\344\27A\f	|\f\307\275u\344\\351 \320"..., 17408}], msg_controllen=0, msg_flags=0}, 0) = 39
//myprog sets non-blocking IO on STDOUT
14209 ioctl(1, FIONBIO, [1])            = 0
//writes out response
14209 writev(1, [{"foo\n", 4}], 1)      = 4
//myprog does more epoll stuff again
//cat receives second command; note that this call started before myprog wrote anything or called ioctl()
14211 <... read resumed> "echo bar\n", 32768) = 9
14209 <... epoll_wait resumed> {{EPOLLOUT, {u32=136514720, u64=136514720}}}, 128, -1) = 1
14211 write(1, "echo bar\n", 9
14209 epoll_wait(5,  
14211 <... write resumed> )             = 9
14209 <... epoll_wait resumed> {{EPOLLIN, {u32=136519504, u64=136519504}}}, 128, -1) = 1
14211 read(0,  
14209 readv(3,  
//cat's next read fails
14211 <... read resumed> 0x8d6f000, 32768) = -1 EAGAIN (Resource temporarily unavailable)

My program does change its own STDOUT to non-blocking, but I can see that it leaves fd 0 alone. The full trace is available.

Sam Varshavchik · Accepted Answer

The way that most shells and terminals work is that stdin and stdout are read/write file descriptors for the same file. Consider the following:

$ echo FOO >&0
FOO

Congratulations, you've just written something to stdin's file descriptor.

The other half of the explanation can be found in the fcntl(2) man page, next to the description of the F_SETFL command that sets the file status flags:

File status flags

Each open file description has certain associated status flags, initialized by open(2) and possibly modified by fcntl(). Duplicated file descriptors (made with dup(2), fcntl(F_DUPFD), fork(2), etc.) refer to the same open file description, and thus share the same file status flags.

So, your theory #1 is mostly correct. If you work out in your head how all the file descriptors get created, your program's standard output and cat's standard input end up being different descriptors for the same open file description, because of the way the shell sets up each launched program's standard input, output, and error. Setting non-blocking mode through one file descriptor therefore affects every descriptor that refers to the same open file description.
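The sharing described in the man page excerpt can be observed directly. The following is a minimal sketch, not code from the question: the helper names (is_nonblocking, set_nonblocking, dup_shares_nonblock_flag) are made up for illustration, and a pipe stands in for the terminal.

```cpp
#include <fcntl.h>
#include <unistd.h>

// Returns true if O_NONBLOCK is set on the open file description behind fd.
bool is_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL);
    return flags != -1 && (flags & O_NONBLOCK) != 0;
}

// Sets O_NONBLOCK through fd; returns false if either fcntl() call fails.
bool set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1) return false;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Hypothetical demo: duplicate a pipe's read end, set O_NONBLOCK through the
// duplicate only, and report whether the original descriptor sees the flag.
bool dup_shares_nonblock_flag() {
    int fds[2];
    if (pipe(fds) == -1) return false;

    int copy = dup(fds[0]);                  // second descriptor, same description
    bool before = is_nonblocking(fds[0]);    // a fresh pipe starts out blocking
    bool ok = copy != -1 && set_nonblocking(copy);
    bool after = is_nonblocking(fds[0]);     // flag is visible via the original

    if (copy != -1) close(copy);
    close(fds[0]);
    close(fds[1]);
    return ok && !before && after;
}
```

Calling dup_shares_nonblock_flag() should return true on a POSIX system: the flag was only ever set through the duplicate, yet the original descriptor reports it as well.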

Note that the above-referenced documentation explicitly references fork(), so you will NOT be able to work around this via forking, which simply duplicates the same file descriptors.
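To illustrate why forking does not help, here is a small sketch under the same assumptions as the man page excerpt. The function name errno_after_child_sets_nonblock is hypothetical, and a pipe again stands in for the terminal. The child plays the role of your program (it sets O_NONBLOCK) and the parent plays the role of cat (it only reads).

```cpp
#include <cerrno>
#include <fcntl.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child that sets O_NONBLOCK on an inherited pipe descriptor, then
// returns the errno the parent gets from read() on the still-empty pipe.
// Returns -1 if setup fails and 0 if read() unexpectedly did not fail.
int errno_after_child_sets_nonblock() {
    int fds[2];
    if (pipe(fds) == -1) return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        // Child: change the flag through its own inherited descriptor.
        int flags = fcntl(fds[0], F_GETFL);
        int rc = (flags == -1) ? -1 : fcntl(fds[0], F_SETFL, flags | O_NONBLOCK);
        _exit(rc == -1 ? 1 : 0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    // Parent: it never touched the flags, yet its read() no longer blocks.
    char buf[16];
    errno = 0;
    ssize_t n = read(fds[0], buf, sizeof buf);
    int err = (n == -1) ? errno : 0;

    close(fds[0]);
    close(fds[1]);
    return err;
}
```

On a POSIX system the function should return EAGAIN, which is the same failure cat reports in the trace above.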

I have two suggestions.

1. Explicitly open("/dev/tty") to get a completely independent file handle.
2. Why do you even need to put your program's standard output into non-blocking mode? Since it goes to the terminal, non-blocking mode doesn't really accomplish anything there.
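The first suggestion works because every open() call creates a new open file description with its own status flags. Here is a minimal sketch of that property; the function name second_open_is_independent is hypothetical, and the path is a parameter so the check can be run against any file that can be opened read/write.

```cpp
#include <fcntl.h>
#include <unistd.h>

// Opens `path` twice, sets O_NONBLOCK through the second descriptor only,
// and returns true if the first descriptor is still in blocking mode.
bool second_open_is_independent(const char* path) {
    int first = open(path, O_RDWR);
    if (first == -1) return false;
    int second = open(path, O_RDWR);
    if (second == -1) {
        close(first);
        return false;
    }

    // Change the flag on the second open file description only.
    int flags = fcntl(second, F_GETFL);
    bool set_ok = flags != -1 &&
                  fcntl(second, F_SETFL, flags | O_NONBLOCK) != -1;

    int first_flags = fcntl(first, F_GETFL);
    int second_flags = fcntl(second, F_GETFL);
    close(first);
    close(second);

    return set_ok && first_flags != -1 && second_flags != -1 &&
           (second_flags & O_NONBLOCK) != 0 &&
           (first_flags & O_NONBLOCK) == 0;
}
```

Applied to this question, the descriptor returned by open("/dev/tty", O_RDWR) could be handed to asio in place of descriptor 1 (for example through boost::asio::posix::stream_descriptor), so that the FIONBIO call lands on a description that cat does not share. This assumes the program has a controlling terminal and that its output really is meant for that terminal.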
