Alexis King

Reputation: 43902

Reading input character-by-character appears to be skipping newlines

I wanted to write a simple function to find all the newlines in an input string in Bash, and this is what I came up with:

find_newlines() {
  while IFS= read -r -n 1 c; do
    if [[ "$c" == $'\n' ]]; then
      echo 'FOUND NEWLINE'
    fi
  done
}

I thought I did everything I needed to. I set IFS to nothing, I passed the -r flag, and I forced read to read a single character at a time with -n 1. However, to my dismay, the following did nothing at all:

test_data="this is a
test"

printf '%s' "$test_data" | find_newlines

I received no output whatsoever. I'm testing under Mac OS X, and I ran this on both Bash version 3.2.57 (the one Apple provides) and 4.3.33 (installed via Homebrew). Both gave the same result.

What do I need to do in order to include newlines when I loop?

Upvotes: 1

Views: 106

Answers (3)

Etan Reisner

Reputation: 81032

The problem is that read, even with -n 1, still reads delimited lines. When it sees the newline, it treats it as the line delimiter and strips it, which leaves you with an empty variable.

$ find_newlines() {
    while IFS= read -r -n 1 c; do
        declare -p c;
    done
}
$ printf 'a\nb' | find_newlines
declare -- c="a"
declare -- c=""
declare -- c="b"
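
Since the newline shows up as an empty c in the output above, one possible workaround that needs no extra flags is to test for the empty string instead of $'\n'. This is only a sketch: it assumes the input contains no NUL bytes (which might also produce an empty value), and the function body is my adaptation of the one in the question rather than anything taken from the manual.

```shell
#!/usr/bin/env bash
# Sketch: with -n 1, read strips the newline delimiter and returns an
# empty string, so an empty c is treated here as "saw a newline".
# Assumption: the input has no NUL bytes.
find_newlines() {
  local c
  while IFS= read -r -n 1 c; do
    if [[ -z "$c" ]]; then
      echo 'FOUND NEWLINE'
    fi
  done
}

printf 'this is a\ntest' | find_newlines
```

Because this uses only -r and -n, it should behave the same on the Bash 3.2 that ships with OS X as on newer releases.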

As indicated in Cyrus's answer, with Bash 4.1+ you can pass the -N flag to read to avoid this problem.

The -N option is documented as (from the Bash Reference Manual entry for read):

-N nchars

read returns after reading exactly nchars characters rather than waiting for a complete line of input, unless EOF is encountered or read times out. Delimiter characters encountered in the input are not treated specially and do not cause read to return until nchars characters are read.
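
Applied to the function from the question, that could look like the sketch below. It assumes Bash 4.1 or later; the only change from the original is -n becoming -N, so the newline reaches the comparison instead of being stripped.

```shell
#!/usr/bin/env bash
# Sketch, assuming Bash 4.1+: -N 1 reads exactly one character and does
# not treat the newline as a delimiter, so c can actually equal $'\n'.
find_newlines() {
  local c
  while IFS= read -r -N 1 c; do
    if [[ "$c" == $'\n' ]]; then
      echo 'FOUND NEWLINE'
    fi
  done
}

printf 'this is a\ntest' | find_newlines
```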

For versions of Bash older than 4.1 (2.04 and later, it looks like), you can use read's -d flag to specify an alternate delimiter and work around this problem. Any delimiter that cannot appear in your input will work. For many input streams the safest choice is the NUL (\0) character, which you specify to read as -d ''.

$ find_newlines() {
    while IFS= read -d '' -r -n 1 c; do
        declare -p c;
    done
}
$ printf 'a\nb' | find_newlines
declare -- c="a"
declare -- c="
"
declare -- c="b"

Upvotes: 2

Zohar81

Reputation: 5128

Maybe you can use sed to replace each newline character with a unique symbol that you can trace later. This works regardless of the Bash version.

sed ':a;N;$!ba;s/\n/@/g' <<< "bla
bla
bla"

bla@bla@bla
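
To make the "trace later" step concrete, here is a small sketch of counting the markers afterwards. Note that it swaps in tr for sed, since tr does the same newline-to-marker replacement with less syntax and the one-line label form may not be accepted by every sed implementation. The variable names and the choice of @ are illustrative, and it assumes @ never appears in the real input.

```shell
#!/usr/bin/env bash
# Sketch: mark each newline with '@' (assumed absent from the input),
# then count the markers using parameter expansion.
input=$'bla\nbla\nbla'

# tr replaces every newline with the marker character.
marked=$(printf '%s' "$input" | tr '\n' '@')

# Delete everything that is not the marker; the remaining length is the count.
markers=${marked//[^@]/}

echo "$marked"
echo "newlines: ${#markers}"
```

If the input may contain @, pick a different marker that cannot occur in the data.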

Upvotes: 0

Cyrus

Reputation: 88899

Replace read's -n option with -N.

See: https://unix.stackexchange.com/a/27424/74329

Upvotes: 3
