Reputation: 670
Ok, here's something that I cannot wrap my head around. I bumped into this while working on a rather complex script. Managed to simplify this to the bare minimum, but it still doesn't make sense.
Let's say I have a fifo:
mkfifo foo.fifo
Running the command below in one terminal, and then writing things into the pipe (echo "abc" > foo.fifo) in another, seems to work fine:
while true; do read LINE <foo.fifo; echo "LINE=$LINE"; done
LINE=abc
However, after changing the command ever so slightly, the read command fails to wait for the next line once it has read the first one:
cat a.fifo | while true; do read LINE; echo "LINE=$LINE"; done
LINE=abc
LINE=
LINE=
LINE=
[...] # And this keeps repeating endlessly
The really disturbing part is that it waits for the first line, but then it just reads an empty string into $LINE and fails to block. (Funnily enough, this is one of the few times I want an I/O operation to block :))
I thought I really understood how I/O redirection and such things work, but now I am rather confused.
So, what's the solution, what am I missing? Can anyone explain this phenomenon?
UPDATE: For a short answer, and a quick solution see William's answer. For a more in-depth, and complete insight, you'd want to go with rici's explanation!
Upvotes: 1
Views: 1010
Reputation: 241951
Really, the two command lines in the question are very similar, if we eliminate the UUOC:
while true; do read LINE <foo.fifo; echo "LINE=$LINE"; done
and
while true; do read LINE; echo "LINE=$LINE"; done <foo.fifo
They act in slightly different ways, but the important point is that neither of them is correct.
The first one opens and reads from the fifo and then closes the fifo every time through the loop. The second one opens the fifo, and then attempts to read from it every time through the loop.
A fifo is a slightly complicated state machine, and it's important to understand the various transitions.
Opening a fifo for reading or writing will block until some process has it open in the other direction. That makes it possible to start a reader and a writer independently; the open calls will return at the same time.
A read from a fifo succeeds if there is data in the fifo buffer. It blocks if there is no data in the fifo buffer but there is at least one writer which holds the fifo open. It returns EOF if there is no data in the fifo buffer and no writer.
A write to a fifo succeeds if there is space in the fifo buffer and there is at least one reader which has the fifo open. It blocks if there is no space in the fifo buffer, but at least one reader has the fifo open. And it triggers SIGPIPE (and then fails with EPIPE if that signal is being ignored) if there is no reader.
Once both ends of the fifo are closed, any data left in the fifo buffer is discarded.
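The read rules above can be checked with a small script. This is only a sketch: the temporary directory, the name demo.fifo, and the use of fd 3 are my own choices for illustration, not anything from the question.

```shell
#!/bin/sh
# Sketch: a read succeeds while data is buffered, and returns EOF once
# the buffer is empty and no writer holds the fifo open.
dir=$(mktemp -d)
fifo="$dir/demo.fifo"          # hypothetical name, for illustration only
mkfifo "$fifo"

# Writer: its open blocks until the reader opens; then it writes one line and closes.
echo "abc" > "$fifo" &

# Reader: open the fifo once on fd 3 and keep it open across both reads.
exec 3< "$fifo"
read -r first <&3;  first_status=$?
wait                           # the writer has definitely closed its end now
read -r second <&3; second_status=$?
exec 3<&-
rm -rf "$dir"

echo "first=$first status=$first_status"
echo "second=$second status=$second_status"
```

The first read reports `first=abc status=0`; the second finds an empty buffer and no writer, so it returns immediately with `second= status=1`.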
Now, based on that, let's consider the first scenario, where the fifo is redirected to the read. We have two processes:
     reader                writer
     --------------        --------------
 1.  OPEN blocks
 2.  OPEN succeeds         OPEN succeeds immediately
 3.  READ blocks
 4.                        WRITE
 5.  READ succeeds
 6.  CLOSE    /////////    CLOSE
(The writer could equally well have started first, in which case it would block at line 1 instead of the reader. But the result is the same. The CLOSE operations at line 6 are not synchronized. See below.)
At line 6, the fifo no longer has readers nor writers, so its buffer is flushed. Consequently, if the writer had written two lines instead of one, the second line would be tossed into the bit bucket, before the loop continues.
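Here is a sketch of that data loss. To make the timing deterministic I spell out the reader's OPEN / READ / CLOSE on fd 3 rather than using `read <fifo` directly; the paths and variable names are made up for the example. It relies on the shell's `read` consuming only one line from a pipe, and on the behaviour described above, which is what Linux does.

```shell
#!/bin/sh
# Sketch: a line left in the buffer is lost once both ends are closed.
dir=$(mktemp -d)
fifo="$dir/demo.fifo"          # hypothetical name, for illustration only
mkfifo "$fifo"

# Writer session 1: two lines, then close.
printf 'one\ntwo\n' > "$fifo" &

exec 3< "$fifo"                # OPEN
read -r a <&3                  # READ: consumes only the first line
wait                           # the writer has closed its end by now
exec 3<&-                      # CLOSE: no reader or writer left, "two" is discarded

# Writer session 2, followed by a fresh open/read/close in the reader.
echo three > "$fifo" &
read -r b < "$fifo"
wait
rm -rf "$dir"

echo "a=$a b=$b"
```

On Linux I would expect `a=one b=three`: the line "two" never reaches the reader.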
Let's contrast that with the second scenario, in which the reader is the while loop and not just the read:
     reader                writer
     ---------             ---------
 1.  OPEN blocks
 2.  OPEN succeeds         OPEN succeeds immediately
 3.  READ blocks
 4.                        WRITE
 5.  READ succeeds
 6.                        CLOSE
     --loop--
 7.  READ returns EOF
 8.  READ returns EOF
 ... and again
42.  and again             OPEN succeeds immediately
43.  and again             WRITE
44.  READ succeeds
Here, the reader will continue to read lines until it runs out. If no writer has appeared by then, the reader will start getting EOFs. If it ignores them (e.g. while true; do read...), then it will get a lot of them, as indicated.
Finally, let's return for a moment to the first scenario, and consider the possibilities when both processes loop. In the description above, I assumed that both CLOSE operations would succeed before either OPEN operation was attempted. That would be the common case, but nothing guarantees it. Suppose instead that the writer succeeds in doing both a CLOSE and an OPEN before the reader manages to do its CLOSE. Now we have the sequence:
     reader                writer
     --------------        --------------
 1.  OPEN blocks
 2.  OPEN succeeds         OPEN succeeds immediately
 3.  READ blocks
 4.                        WRITE
 5.                        CLOSE
 5.  READ succeeds         OPEN
 6.  CLOSE
 7.                        WRITE !! SIGPIPE !!
In short, the first invocation will skip lines, and has a race condition in which the writer will occasionally receive a spurious error. The second invocation will read everything written, and the writer will be safe, but the reader will continuously receive EOF indications instead of blocking until data is available.
So what is the correct solution?
Aside from the race condition, the optimal strategy for the reader is to read until EOF, and then close and reopen the fifo. The second open will block if there is no writer. That can be achieved with a nested loop:
while :; do
    while read line; do
        echo "LINE=$line"
    done < fifo
done
Unfortunately, the race condition which generates SIGPIPE is still possible, although it is going to be extremely rare [See note 1]. All the same, a writer would have to be prepared for its write to fail.
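A bounded version of the nested loop makes it easy to try out. This is a sketch under my own assumptions: the outer loop runs exactly twice instead of forever, and each round starts its own stand-in writer so the script terminates.

```shell
#!/bin/sh
# Sketch: read until EOF, then close and reopen the fifo for the next writer.
dir=$(mktemp -d)
fifo="$dir/demo.fifo"          # hypothetical name, for illustration only
mkfifo "$fifo"

out=""
for payload in 'a\nb\n' 'c\n'; do
    # Stand-in writer for this round: open, write, close.
    printf '%b' "$payload" > "$fifo" &

    # Inner loop: the open blocks until the writer shows up, then we read
    # every line until EOF (the writer closed), and close the fifo.
    while read -r line; do
        out="$out$line "
    done < "$fifo"

    wait                       # writer for this round is finished
done
rm -rf "$dir"

echo "out=$out"
```

Both sessions are read completely (`out=a b c `), and the reader never spins on EOF because each EOF ends the inner loop and leads to a fresh, blocking open.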
A simpler and more robust solution is available on Linux, because Linux allows fifos to be opened for reading and writing. Such an open always succeeds immediately. And since there is always a process which holds the fifo open for writing, the reads will block, as expected:
while read line; do
    echo "LINE=$line"
done <> fifo
(Note that in bash, the "redirect both ways" operator <> still only redirects stdin -- or fd n, in the form n<> -- so the above does not mean "redirect stdin and stdout to fifo".)
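A small sketch of the read-write open, assuming Linux (POSIX leaves opening a fifo O_RDWR undefined). The paths, fd 3, and the one-second delay are arbitrary choices for the demonstration.

```shell
#!/bin/sh
# Sketch: with the fifo held open read-write, writers never block on open
# and the reader blocks for data instead of seeing EOF.
dir=$(mktemp -d)
fifo="$dir/demo.fifo"          # hypothetical name, for illustration only
mkfifo "$fifo"

exec 3<> "$fifo"               # returns immediately, no writer required

# Two separate writer sessions; each open succeeds at once because
# fd 3 counts as a reader, and the data waits in the fifo buffer.
echo one > "$fifo"
echo two > "$fifo"
read -r a <&3
read -r b <&3

# No data and no other writer: the next read blocks (fd 3 itself counts
# as a writer) until the delayed writer below delivers a line.
( sleep 1; echo three > "$fifo" ) &
read -r c <&3
wait
exec 3<&-
rm -rf "$dir"

echo "a=$a b=$b c=$c"
```

With a plain read-only open, the third read would have returned EOF with an empty value instead of waiting for "three".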
Note 1: The fact that a race condition is extremely rare is not a reason to ignore it. Murphy's law states that it will happen at the most critical moment; for example, when the correct functioning was necessary in order to create a backup just before a critical file was corrupted. But in order to trigger the race condition, the writer process needs to arrange for its actions to happen in some extremely tight time bands:
     reader                writer
     --------------        --------------
     fifo is open          fifo is open
 1.  READ blocks
 2.                        CLOSE
 3.  READ returns EOF
 4.                        OPEN
 5.  CLOSE
 6.                        WRITE !! SIGPIPE !!
 7.  OPEN
In other words, the writer needs to perform its OPEN in the brief interval between the moment the reader receives an EOF and responds by closing the fifo. (That's the only way the writer's OPEN won't block.) And then it needs to do the write in the (different) brief interval between the moment that the reader closes the fifo, and the subsequent reopen. (The reopen wouldn't block because now the writer has the fifo open.)
That's one of those once in a hundred million race conditions that, as I said, only pops up at the most inopportune moment, possibly years after the code was written. But that doesn't mean you can ignore it. Make sure that the writer is prepared to handle SIGPIPE and retry a write which fails with EPIPE.
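One way a shell writer might do that is sketched below. `send_line` is a hypothetical helper of mine, and the retry limit of 5 and the one-second pause are arbitrary. The demo run only exercises the normal path, since the race itself is too rare to provoke on demand.

```shell
#!/bin/sh
# Sketch: a writer that survives SIGPIPE and retries a failed write.
dir=$(mktemp -d)
fifo="$dir/demo.fifo"          # hypothetical name, for illustration only
mkfifo "$fifo"

# Ignore SIGPIPE so that a write with no reader fails with EPIPE
# (a non-zero status) instead of killing the shell.
trap '' PIPE

# Hypothetical helper: retry the whole open/write/close cycle a few times.
send_line() {
    attempts=0
    until printf '%s\n' "$1" 2>/dev/null > "$fifo"; do
        attempts=$((attempts + 1))
        [ "$attempts" -ge 5 ] && return 1
        sleep 1
    done
}

cat "$fifo" > "$dir/out" &     # stand-in reader for the demo
send_line "hello"; send_status=$?
wait
received=$(cat "$dir/out")
rm -rf "$dir"

echo "status=$send_status received=$received"
```

Retrying is safe here because a write that fails with EPIPE has delivered nothing, so the line cannot end up duplicated.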
Upvotes: 4
Reputation: 212674
When you do
cat a.fifo | while true; do read LINE; echo "LINE=$LINE"; done
which, incidentally, ought to be written:
while true; do read LINE; echo "LINE=$LINE"; done < a.fifo
that script will block until someone opens the fifo for writing. As soon as that happens, the while loop begins. If the writer (the 'echo foo > a.fifo' you ran in another shell) terminates and there is no one else with the pipe open for writing, then the read returns because the pipe is empty and there are no processes that have the other end open. Try this:
in one shell:
while true; do date; read LINE; echo "LINE=$LINE"; done < a.fifo
in a second shell:
cat > a.fifo
in a third shell:
echo hello > a.fifo
echo world > a.fifo
By keeping the cat running in the second shell, the read in the while loop blocks instead of returning.
I guess the key insight is that when you do the redirection inside the loop, the shell does not start the read until someone opens the pipe for writing. When you do the redirection to the while loop, the shell only blocks before it starts the loop.
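The three-shell experiment can also be squeezed into one script. This is a sketch: I use `sleep 2 > fifo` as a stand-in for the `cat > a.fifo` that holds the write end open, and the file names are made up.

```shell
#!/bin/sh
# Sketch: a long-lived process holding the write end open keeps the
# reader blocking between short-lived writers.
dir=$(mktemp -d)
fifo="$dir/demo.fifo"          # hypothetical name, for illustration only
mkfifo "$fifo"

# Holder: keeps the fifo open for writing for about 2 seconds.
sleep 2 > "$fifo" &

# Two short writers, one after the other, each opening and closing the fifo.
( echo hello > "$fifo"; echo world > "$fifo" ) &

# Reader: the redirection is on the loop, so the fifo is opened once.
out=""
while read -r line; do
    out="$out$line "
done < "$fifo"
wait
rm -rf "$dir"

echo "out=$out"
```

The loop picks up both lines (`out=hello world `) and only sees EOF when the holder exits and the last write end is closed.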
Upvotes: 2