Why does the Bash read command return without any input?

Question

I have a Bash script foo that accepts a list of words on STDIN. The script should read the words into an array, then ask for user input:

while IFS= read -r; do
  words+=( "$REPLY" )
done

read -r ans -p "Do stuff? [Yn] "
echo "ans: |$ans|"

The problem is that Bash immediately reads an empty string into the ans variable, without waiting for actual user input. That is, I get the following output (with no prompt or delay):

$ cat words.txt | foo
ans: ||

Since everything piped into STDIN has already been consumed by the first read call, why does the second read call return without actually reading anything?

mklement0 · Accepted Answer

Judging by your symptoms, it looks like you've redirected stdin to provide the list of words to the while loop either via an input file (foo < file) or via a pipeline (... | foo).

If so, your second read command won't automatically switch back to reading from the terminal; it is still reading from whatever stdin was redirected to, and if that input has been consumed (which is exactly what your while loop does, as chepner points out in a comment), read reads nothing, and returns with exit code 1 (which is what terminated the while loop to begin with).
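The behavior can be reproduced without a terminal. The following is a minimal sketch; `drain_then_read` is a hypothetical helper that mimics the question's script by consuming stdin in a loop and then attempting one more read from the same, now exhausted, stream:

```shell
# Hypothetical helper (Bash): drain stdin, then try one more read from it.
drain_then_read() {
  local words=() ans status=0
  while IFS= read -r; do
    words+=( "$REPLY" )
  done
  # stdin is already at EOF here, so read stores an empty string and fails.
  read -r ans || status=$?
  echo "words: ${#words[@]}, ans: |$ans|, status: $status"
}

printf 'one\ntwo\n' | drain_then_read   # prints: words: 2, ans: ||, status: 1
```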

If you explicitly want the second read command to get user input from the terminal, use:

read -r -p "Do stuff? [Yn] " ans </dev/tty
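Reading from the terminal while stdin is redirected means taking that one read's input from /dev/tty, the controlling terminal. When no terminal can be opened (for example under cron), such a redirection fails, so a script may want a fallback. A minimal sketch follows; the helper name `ask_user`, its default-answer behavior, and the optional third argument that overrides the device path are all assumptions made for illustration:

```shell
# Hypothetical helper (Bash): ask_user PROMPT DEFAULT [DEVICE]
# Reads one line from DEVICE (default: /dev/tty) regardless of where stdin
# points, and prints DEFAULT if the device cannot be opened or yields nothing.
ask_user() {
  local prompt=$1 default=$2 tty=${3:-/dev/tty} ans=
  # Probe whether the device can actually be opened for reading.
  if { true <"$tty"; } 2>/dev/null; then
    read -r -p "$prompt" ans <"$tty" || true
  fi
  echo "${ans:-$default}"
}
```

In the question's script, the call would be something like ans=$(ask_user "Do stuff? [Yn] " Y) after the while loop has consumed the piped-in words.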



Note:


Stdin redirected from a (finite) file (or pipeline or process substitution with finite output) is a finite resource that eventually reports an EOF condition once all input has been consumed:


read translates the EOF condition into exit code 1, causing the while loop to exit:


Specifically, if read cannot read any more characters, it assigns the null string (empty string) to the specified variable(s) (or $REPLY if none were specified), and sets the exit code to 1.

Note: read may set exit code 1 even when it does read characters (and stores them in the specified variable(s) / $REPLY), namely if the input ends without a delimiter; the delimiter is a newline (\n) by default, otherwise the delimiter explicitly specified with -d.
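A practical consequence is that a plain read loop silently skips a final line that lacks a trailing newline. A common workaround is to also test whether read stored anything. The sketch below compares the two loops; `count_plain` and `count_all` are hypothetical helper names:

```shell
# Hypothetical helper (Bash): count the lines a plain read loop processes.
count_plain() {
  local n=0
  while IFS= read -r; do n=$((n + 1)); done
  echo "$n"
}

# Same loop, but it also processes a final line without a trailing newline:
# read fails on that line, yet $REPLY still holds the text that was read.
count_all() {
  local n=0
  while IFS= read -r || [[ -n $REPLY ]]; do n=$((n + 1)); done
  echo "$n"
}

printf 'a\nb' | count_plain   # prints 1
printf 'a\nb' | count_all     # prints 2
```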

Once all input has been consumed, subsequent read commands cannot read anything anymore (the EOF condition persists, and the behavior is as described above).

By contrast, interactive stdin input from a terminal is potentially infinite: additional data is provided by whatever the user types interactively whenever stdin input is requested.


The way to simulate an EOF condition during interactive multiline input (i.e., to terminate an input loop) is to press ^D (Control-D):


When ^D is pressed once at the very start of a line, read returns without reading anything and sets the exit code to 1, just as if EOF had been encountered.


In other words: the way to terminate unbounded interactive input in a loop is to press ^D after having submitted the last line of input.

By contrast, in the interior of an input line, pressing ^D twice is needed to stop reading and set the exit code to 1, but note that the line typed so far is saved to the target variable(s) / $REPLY. [1]

Since the stdin input stream wasn't actually closed, subsequent read commands work normally and continue to solicit interactive user input.
Caveat: If you press ^D at the shell's prompt (as opposed to while a running program is requesting input), you'll terminate the shell itself.





P.S.:

There is one incidental error in the question:


The second read command must place operand ans (the name of the variable to store the input in) after all options in order to work syntactically: read -r -p "Do stuff? [Yn] " ans




[1] As William Pursell points out in a comment on the question: ^D causes the read(2) system call to return with whatever is in the buffer at that point; the direct value returned is the count of characters read.

A count of 0 is how the EOF condition is signaled, and Bash's read translates that into exit code 1, causing termination of the loop.

Thus, pressing ^D at the start of a line, when the input buffer is empty, exits the loop immediately.

By contrast, if characters have already been typed on the line, then the first ^D causes read(2) to return however many characters were typed so far, upon which Bash's read reinvokes read(2), because the delimiter (a newline by default) hasn't been encountered yet.

An immediately following second ^D then causes read(2) to return 0, since no characters were typed, causing Bash's read to set exit code 1 and exit the loop.
