Reputation: 53
The answer to the question "Split bash string by newline characters" seems to say that newlines are the default delimiter, so we should change the delimiter to NUL and split on that instead. Why doesn't splitting on the newline work? What I would expect (and desire, in my use case) is a 1:1 correspondence between lines and \n characters in the input string (so a \n must be added to get the last line), with blank lines, leading/embedded whitespace, etc. preserved.
Quoting from Mark Gerolimatos, who seems to be asking the same question:
In OS-X/Macland, you have to use bash 3.2 (or at least without updating BASH). Thus the mysterious read -rd ' ' must be used (and works!) the online manual page I found is pretty cryptic about this (ss64.com/bash/read.html)...it's pretty mind-bending...does it mean "turn off \n, and then use emptiness as the delimiter?"
Upvotes: 0
Views: 1006
Reputation: 183211
Just to make sure we're on the same page, this is the code in that answer:
IFS=$'\n' read -rd '' -a y <<<"$x"
where x is the variable to read from and y is the array variable to populate with the lines of x.
Why doesn't splitting on the newline work?
It does; the IFS=$'\n' is telling read to split on newlines.
If you're asking why you can't write read -rd $'\n' -a y, then: the delimiter indicated by -d tells read where to stop reading. So if you set that to a newline, then read will only read one line!
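A minimal sketch of that behaviour, assuming a three-line variable x and an array name first_only that are made up for illustration:

```shell
#!/usr/bin/env bash
# x is a hypothetical three-line input used only for illustration.
x=$'one\ntwo\nthree'

# -d $'\n' makes read stop at the first newline, so only "one" is consumed;
# IFS=$'\n' then has nothing left to split on.
IFS=$'\n' read -r -d $'\n' -a first_only <<< "$x"

echo "${#first_only[@]}"   # prints 1
echo "${first_only[0]}"    # prints one
```

The remaining two lines are simply never read, which is why the array ends up with a single element.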
What I would […] desire […] is that […] blank lines […] would be preserved.
Yes, it's annoying that initial or consecutive occurrences of the separator get discarded, such that x=$'\na\n\nb' gives the same result as x=$'a\nb'.
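This is easy to check with the one-shot form. The sketch below reuses the x and y names from above; the "|| true" is an addition so the non-zero exit status that read returns at end of input is ignored:

```shell
#!/usr/bin/env bash
# Input with a leading blank line and a blank line between "a" and "b".
x=$'\na\n\nb'

# Newline is IFS whitespace, so leading and repeated newlines collapse.
# read -d '' hits end of input before finding a NUL and returns non-zero.
IFS=$'\n' read -r -d '' -a y <<< "$x" || true

echo "${#y[@]}"            # prints 2
printf '[%s]\n' "${y[@]}"  # prints [a] then [b]
```

Both blank lines are gone, so the array cannot be mapped back to line numbers in the input.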
To satisfy your requirements, you'll need to use a slightly different approach, where you call read once per line:
y=()
while IFS= read -r -d $'\n' ; do
    y+=("$REPLY")
done <<< "${x%$'\n'*}"
In this approach, we tell read to just take the line as-is and not split it (hence IFS=), and we handle the looping ourselves.
Note that the "${x%$'\n'*}" bit strips off the last newline and everything after it, per your requirement to ignore the last line if it doesn't have a newline. (The <<< bit implicitly adds a newline.)
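As a quick check of the loop, here is a sketch with a made-up input that has leading whitespace, a blank line, and an unterminated last line:

```shell
#!/usr/bin/env bash
# Hypothetical input: two leading spaces, a blank second line, and a final
# fragment with no terminating newline.
x=$'  first\n\nthird\ntrailing fragment'

y=()
while IFS= read -r -d $'\n'; do
    y+=("$REPLY")            # REPLY holds the line exactly as read
done <<< "${x%$'\n'*}"       # drop the last newline and everything after it

echo "${#y[@]}"              # prints 3
printf '[%s]\n' "${y[@]}"    # prints [  first], [], [third]
```

The blank line and the leading spaces survive, and the unterminated "trailing fragment" is dropped. One caveat: if x contains no newline at all, the pattern does not match, so the whole string is kept and read as a single line.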
Upvotes: 2
Reputation: 123410
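The confusion happens because read operates with two delimiters: the line delimiter, set with -d, which determines where read stops reading, and the field separator, set with IFS, which determines how the text that was read is split into fields.
By default, the line delimiter is a newline and IFS is space, tab, and newline.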
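If you just set IFS=$'\n', read still stops at the first newline, because the line delimiter is unchanged. There is nothing left to split, and you only get the first line.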
What you instead want to do is change both delimiters:
IFS=$'\n' read -rd '' -a y <<< "$x"
read -d '' causes read to read until an ASCII NUL, which is not found in normal text, and is therefore a workable proxy for "read all text input".
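The interaction between -d and IFS can be sketched side by side. The variable and array names below are made up for illustration. Because the input never contains a NUL, read -d '' reaches end of input and returns a non-zero status even though the array is filled, which matters under set -e:

```shell
#!/usr/bin/env bash
# Hypothetical two-line input with a space inside each line.
x=$'a b\nc d'

# Defaults: stop at the first newline, split on space/tab/newline.
read -r -a words <<< "$x"                  # words=(a b)

# Only IFS changed: still stops at the first newline, nothing to split.
IFS=$'\n' read -r -a one_line <<< "$x"     # one_line=("a b")

# Both changed: read to end of input, then split on newlines only.
status=0
IFS=$'\n' read -r -d '' -a lines <<< "$x" || status=$?   # lines=("a b" "c d")

echo "${#words[@]} ${#one_line[@]} ${#lines[@]} $status"  # prints 2 1 2 1
```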
Upvotes: 5