makansij

Reputation: 9875

When does word-splitting occur in bash?

I used to think that I understood bash scripting very well, but something recently has called my understanding into question:

I thought that word splitting works by scanning the input and replacing any character found in the value of IFS with a space.

I printed my IFS and it is

>>> echo "$IFS" | xxd
00000000: 2009 0a0a

which tells me that it should convert all tabs, newline characters, and spaces into JUST spaces.

So, I decided to test this:

list_of_stuff=("\n")

for elm in "${list_of_stuff[@]}"
do
    echo $elm
done

I would expect that it would convert the \n into \s. But when I run this, I get the following output:

>>> sh test_bash_script.sh


>>>

..which tells me that it is not converting the \n into \s as I would expect. When should I expect this conversion to occur?

Upvotes: 1

Views: 233

Answers (1)

Gordon Davisson

Reputation: 125948

Your array doesn't have a newline in it, it has a backslash character followed by an "n". When the shell executes echo $elm, it converts the $elm to '\n', performs word splitting (no whitespace characters found), and passes that to echo as its argument. echo then sees \n, and performs escape interpretation (some versions of echo do this, some don't) which converts it to a newline, and prints that.

Try it with printf "'%s'\n" $elm to get a better idea of what's happening:

$ list_of_stuff=("\n")
$ for elm in "${list_of_stuff[@]}"; do
> printf "'%s'\n" $elm
> done
'\n'
$ list2=($'\n')    # This'll give an actual newline
$ for elm in "${list2[@]}"; do
> printf "'%s'\n" $elm
> done
''

But... why did it print nothing that second time? It's because $elm expanded to a newline, which word splitting turned into 0 words, so it ran the equivalent of printf "'%s'\n", which just prints two single-quotes followed by a newline.
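A quick way to confirm the difference between the two values is to check their lengths and count the words each one splits into. This is only a sketch: count_args is a hypothetical helper (not part of the question or of bash), and it assumes IFS still has its default value of space, tab, and newline.

```bash
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

literal="\n"     # two characters: a backslash followed by an "n"
newline=$'\n'    # one character: an actual newline

echo "${#literal}"       # prints 2
echo "${#newline}"       # prints 1
count_args $literal      # prints 1: no IFS characters in the value, so it stays one word
count_args $newline      # prints 0: the value is only IFS whitespace, so no words remain
count_args "$newline"    # prints 1: double quotes suppress word splitting
```

The last line is the reason quoting matters: with double quotes around the expansion, the newline reaches the command as a single argument instead of disappearing.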

BTW, you can also use set -x to get a better idea what's going on in cases like this. In your original case, it would show that it's executing the equivalent of echo '\n'.
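As a minimal sketch of that tracing approach (assuming bash, with the variable set to the same value as the array element in the question), you can switch tracing on around just the command you care about. The trace is written to stderr, and its exact formatting can vary between shells.

```bash
#!/usr/bin/env bash
elm="\n"    # same value as the array element in the question

set -x      # trace each command to stderr after its expansions have been applied
echo $elm   # the trace line shows the arguments the shell actually passes to echo
set +x      # turn tracing back off
```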

[EDIT] To answer the questions about "turned into 0 words" and equivalent of printf "'%s'\n": word splitting does not turn anything into spaces; it turns a string into a series of words. If echo gets multiple arguments ("words"), it sticks them together with spaces in between, so word splitting + echo can have the effect of turning all whitespace into single spaces, but that's not actually what word splitting itself does. Consider several examples:

$ var1=$' \t word1 \n  \t  word2   \nword3 \n \n '    # Note that $' ' converts escape sequences
$ printf "'%s'\n" "$var1"    # This prints the actual contents with quotes around, no further interpretation
'    word1 
      word2   
word3 


'
$ echo $var1    # No quotes, so it gets word-split; echo pastes together with spaces
word1 word2 word3
$ printargs() {    # Let's define a function to show what's happening more clearly
> echo "Received $# arguments:"
> for arg in "$@"; do
> printf "   '%s'\n" "$arg"
> done
> }
$ printargs $var1
Received 3 arguments:
   'word1'
   'word2'
   'word3'

Let's take a look at the echo and printargs commands in a little more detail. In echo $var1, the value of $var1 gets word-split on whitespace (spaces, tabs, and newlines), which turns it into three words: "word1", "word2", and "word3". There is no whitespace left in those words; it has all been removed. So it executes the equivalent of echo "word1" "word2" "word3": echo takes those three arguments, adds spaces between them, and prints the result.

Now, I defined printargs as a function that prints how many arguments it got, followed by each argument (indented and single-quoted). In printargs $var1, word splitting happens the same way, so it executes the equivalent of printargs "word1" "word2" "word3". printargs reports that it got three arguments and prints each one separately (no spaces, except those I made it add for indentation).

Ok, next series of examples:

$ var2=$' \t \t    \n \t   '    # All whitespace this time
$ printf "'%s'\n" "$var2"
'           
       '
$ echo $var2

$ printargs $var2
Received 0 arguments:

Again, let's look at the last two commands in more detail: In echo $var2, word-splitting finds zero words in the value of $var2 -- it's all whitespace -- so it passes zero arguments to echo. The command is equivalent to just echo with no arguments at all, so echo just prints a blank line (no space or anything). Similarly, in printargs $var2, $var2 word-splits to zero words, so printargs gets (and reports getting) zero arguments. Compare the output with these fully equivalent commands:

$ echo

$ printargs
Received 0 arguments:
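To tie this back to the "when" in the question: word splitting applies to unquoted expansions, and it produces separate words rather than replacing characters. The sketch below illustrates that with a hypothetical count_args helper (an assumption for illustration, similar in spirit to printargs); the colon-separated value and the temporary IFS=: setting are also just illustrative choices.

```bash
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

var=$' word1 \t word2 \n word3 '

count_args $var          # prints 3: unquoted, so the value is split on IFS whitespace
count_args "$var"        # prints 1: quoted, so no splitting happens

path_like="a:b:c"
IFS=:                    # illustrative assumption: split on colons instead
count_args $path_like    # prints 3: split into "a", "b", and "c", with no spaces involved
unset IFS                # an unset IFS behaves like the default (space, tab, newline)
```

Nothing in the colon case ever becomes a space; the shell simply hands the command three separate arguments.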

Upvotes: 2
