Reputation: 4121
TLDR: In Solaris, if O_NDELAY is set on stdin by a child process, bash exits. Why?
The following code causes interactive bash (v4.3.33) or tcsh (6.19.00) shells to exit after the process finishes running:
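#include <fcntl.h>

int main() {
    /* Set O_NDELAY on stdin (fd 0); the flag outlives this process. */
    fcntl( 0, F_SETFL, O_NDELAY );
    /* Alternative: toggle O_NDELAY; ~(x ^ ~O_NDELAY) equals x ^ O_NDELAY. */
    //int x = fcntl( 0, F_GETFL );
    //fcntl( 0, F_SETFL, ~(x ^ (~O_NDELAY)) );
    return 0;
}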
The versions of ksh, csh and zsh we have aren't affected by this problem.
To investigate, I ran bash and csh under truss (similar to strace on Linux) like this:
$ truss -eaf -o bash.txt -u'*' -{v,r,w}all bash --noprofile --norc
$ truss -eaf -o csh.txt -u'*' -{v,r,w}all csh -f
After csh finishes running the process, it does the following:
fcntl( 0, F_GETFL ) = FWRITE|FNDELAY
fcntl( 0, F_SETFL, FWRITE) = 0
... which gave me an idea. I changed the program to the commented-out code above so that it toggles the state of O_NDELAY. If I run it twice in a row, bash doesn't exit.
Upvotes: 0
Views: 197
Reputation: 4121
This answer got me started on the right path. The man page for read (in Solaris) says:
When attempting to read a file associated with a terminal that has no data currently available:
* If O_NDELAY is set, read() returns 0
* If O_NONBLOCK is set, read() returns -1 and sets errno to EAGAIN
... so when bash tries to read stdin, read() returns 0, which bash takes to mean EOF was hit.
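To make the distinction concrete, here is a minimal sketch of how a reader could classify the result of a single read(). The names classify_read and read_state are made up for illustration and are not taken from bash or tcsh. Note that a return value of 0 is ambiguous: the caller cannot tell a real EOF from "O_NDELAY is set and no data is available" on Solaris, which is the trap the shell falls into.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical names, for illustration only. */
enum read_state {
    READ_DATA,           /* at least one byte was read */
    READ_EOF_OR_NDELAY,  /* read() returned 0: real EOF, or O_NDELAY with no data (Solaris) */
    READ_WOULD_BLOCK,    /* read() returned -1 with EAGAIN/EWOULDBLOCK (O_NONBLOCK) */
    READ_ERROR           /* any other failure */
};

/* Try to read one byte from fd and classify the outcome.
   The byte is consumed if one is available. */
enum read_state classify_read(int fd)
{
    char c;
    ssize_t n;

    assert(fd >= 0);
    n = read(fd, &c, 1);
    if (n > 0)
        return READ_DATA;
    if (n == 0)
        return READ_EOF_OR_NDELAY;
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return READ_WOULD_BLOCK;
    return READ_ERROR;
}
```

With O_NONBLOCK the caller gets an explicit READ_WOULD_BLOCK and can retry; with O_NDELAY on Solaris it only ever sees the ambiguous zero.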
This page indicates O_NDELAY shouldn't be used anymore, instead recommending O_NONBLOCK. I've found similar statements regarding O_NDELAY / FIONBIO for various flavors of UNIX.
As an aside, in Linux O_NDELAY == FNDELAY == O_NONBLOCK, so it's not terribly surprising I was unable to reproduce this problem in that environment.
Unfortunately, the tool that's doing this isn't one I have the source code for, though from my experimenting I've found ways to work around the problem.
If nothing else, I can write a simple program that removes O_NDELAY as above, then wrap execution of the tool in a shell script that always runs the "fixer" program after it.
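A rough sketch of what the core of such a fixer might look like, mirroring the F_GETFL / F_SETFL pair that csh issues in the truss output. The function name clear_ndelay is my own invention; a main() that simply calls clear_ndelay(0) would turn it into the standalone program. It works from a child process because the status flags live on the open file description, which the shell and its children share.

```c
#include <assert.h>
#include <fcntl.h>

/* Clear O_NDELAY on fd while leaving the other status flags alone.
   Returns 0 on success, -1 on failure (errno is set by fcntl).
   clear_ndelay is a hypothetical name used for illustration. */
int clear_ndelay(int fd)
{
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    if (fcntl(fd, F_SETFL, flags & ~O_NDELAY) == -1)
        return -1;
    return 0;
}
```

Unlike the toggle version in the question, this is idempotent: running it when O_NDELAY is already clear changes nothing, so the wrapper script can call it unconditionally.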
Upvotes: 0