Reputation: 10882
In bash, I would like to transform a PATH-like environment variable (colon-separated, with elements that may contain spaces) into an array, making sure that elements containing spaces are not word-split into multiple elements.
Let PATH_VARIABLE be the variable in question.
Let un:dodecaedro:per:tirare:per:i danni be the content of the variable.
The desired array is intended to have 6 elements, not 7.
0) un
1) dodecaedro
2) per
3) tirare
4) per
5) i danni
The tricky entry is the one containing a space: i danni.
I am looking for the absolute most elegant and correct way to achieve this.
Limitation: it must work with my bash version: v3.2.48(1)-release
In Python this is done beautifully, like so:
>>> v='un:dodecaedro:per:tirare:per:i danni'
>>> len(v.split(':'))
6
Works. Shows what I am looking for.
What's the best way to do this in our beloved bash?
Can you specifically improve on my attempt 4?
#!/bin/bash
PATH_VARIABLE='un:dodecaedro:per:tirare:per:i danni'
# WRONG
a1=($(echo $PATH_VARIABLE | tr ':' '\n'))
# WRONG
a2=($(
    while read path_component; do
        echo "$path_component"
    done < <(echo "$PATH_VARIABLE" | tr ':' '\n')
))
# WORKS, it is elegant.. but I have no bash 4!
# readarray -t a3 < <(echo "$PATH_VARIABLE" | tr ':' '\n')
# WORKS, but it looks "clunky" to me :(
i=0
while read line; do
    a4[i++]=$line
done < <(echo "$PATH_VARIABLE" | tr ':' '\n')
n=${#a4[@]}
for ((i=0; i < n; i++)); do
    printf '%2d) %s\n' "$i" "${a4[i]}"
done
bash v3.2.48(1)-release
osx OS X v10.8.3 (build 12D78)
Upvotes: 14
Views: 21249
Reputation: 77107
f() {
    local IFS=:
    local foo
    set -f # Disable glob expansion
    foo=( $@ ) # Deliberately unquoted
    set +f
    printf '%d\n' "${#foo[@]}"
    printf '%s\n' "${foo[@]}"
}
f 'un:dodecaedro:per:tirare:per:i danni'
6
un
dodecaedro
per
tirare
per
i danni
Modifying Jim McNamara's answer, you could just reset IFS:
oIFS="$IFS"
foo='un:dodecaedro:per:tirare:per:i danni'
IFS=: arr=( $foo )
IFS="$oIFS"
I prefer the function scope because it protects IFS changes from bleeding into the global scope without requiring special care to reset it.
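The function-scoped approach can be sketched as a small reusable helper. This is only a sketch, not part of the answer above: the names split_on_colon, SPLIT_RESULT and ifs_before are hypothetical, and it restores globbing only if it was enabled on entry.

```bash
#!/bin/bash
# Hypothetical helper: split "$1" on ':' into the global array SPLIT_RESULT.
split_on_colon() {
    local IFS=:              # scoped to this function; the caller's IFS is untouched
    local restore_glob=0
    case $- in
        *f*) ;;              # globbing was already disabled by the caller
        *) set -f; restore_glob=1 ;;
    esac
    SPLIT_RESULT=( $1 )      # deliberately unquoted: split on ':' only
    if (( restore_glob )); then set +f; fi
}

ifs_before=$IFS
split_on_colon 'un:dodecaedro:per:tirare:per:i danni'
echo "${#SPLIT_RESULT[@]}"
printf '%s\n' "${SPLIT_RESULT[@]}"
[[ $IFS == "$ifs_before" ]] && echo "IFS unchanged"
```

Because IFS is declared local, no save-and-restore code is needed at the call site.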
As a matter of clarification: In the second example, the IFS setting does change the global variable. The salient difference between this:
IFS=: arr=( $foo )
and this:
IFS=: read -a arr <<< "$foo"
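is that the former is two variable assignments with no command name, so both persist in the current shell, while the latter is a simple command in which the IFS assignment applies only to the environment of read (see "Simple Command Expansion" in bash(1)).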
$ echo "$BASH_VERSION"
3.2.48(1)-release
$ echo "$IFS"
$ foo='un:dodecaedro:per:tirare:per:i danni'
$ IFS=: read -a arr <<< "$foo"
$ echo "${#arr[@]}"
6
$ echo "$IFS"
$ IFS=: arr1=( $foo )
$ echo "${#arr1[@]}"
6
$ echo "$IFS"
:
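Applied to the question, the read form can replace the whole loop of attempt 4 with one line. A minimal sketch, assuming no element contains a newline (read stops at the first one); the -r flag is an addition here so that backslashes are kept literally.

```bash
#!/bin/bash
PATH_VARIABLE='un:dodecaedro:per:tirare:per:i danni'

# IFS=: is a prefix assignment, so it applies only to this read command.
IFS=: read -r -a a4 <<< "$PATH_VARIABLE"

n=${#a4[@]}
for ((i = 0; i < n; i++)); do
    printf '%2d) %s\n' "$i" "${a4[i]}"
done
```

One caveat of this form: a single trailing delimiter is dropped, so 'a:b:' yields two elements rather than three.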
Upvotes: 9
Reputation: 6577
# Right. Add -d '' if PATH members may contain newlines.
IFS=: read -ra myPath <<<"$PATH"
# Wrong!
IFS=: myPath=($PATH)
# Wrong!
IFS=:
for x in $PATH; do ...
# How to do it wrong right...
# Works around some but not all word split problems
# For portability, some extra wrappers are needed and it's even harder.
function stupidSplit {
    if [[ -z $3 ]]; then
        return 1
    elif [[ $- != *f* ]]; then
        trap 'trap RETURN; set +f' RETURN
        set -f
    fi
    IFS=$3 command eval "${1}=(\$${2})"
}
function main {
    typeset -a myPath
    if ! stupidSplit myPath PATH :; then
        echo "Don't pass stupid stuff to stupidSplit" >&2
        return 1
    fi
}
main
Rule #1: Don't cram a compound data structure into a string or stream unless there's no alternative. PATH is one case where you have to deal with it.
Rule #2: Avoid word / field splitting at all costs. There are almost no legitimate reasons to apply word splitting on the value of a parameter in non-minimalist shells such as Bash. Almost all beginner pitfalls can be avoided by just never word splitting with IFS. Always quote.
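One concrete reason behind Rule #2 is that an unquoted expansion undergoes pathname expansion as well as splitting. The sketch below contrasts the two forms on a value whose middle element is a literal asterisk. It runs in a scratch directory, and the file names a.txt and b.txt, like the variable names, are hypothetical.

```bash
#!/bin/bash
value='x:*:y'                        # the middle element is a literal asterisk

orig_dir=$PWD
scratch=$(mktemp -d "${TMPDIR:-/tmp}/split.XXXXXX") || exit 1
cd "$scratch" || exit 1
touch a.txt b.txt                    # hypothetical files for the glob to match

# read splits on ':' and performs no pathname expansion.
IFS=: read -r -a safe <<< "$value"

# The unquoted expansion splits on ':' and then globs each resulting field.
old_ifs=$IFS
IFS=:
unsafe=( $value )
IFS=$old_ifs

cd "$orig_dir" && rm -rf "$scratch"

echo "read -a:  ${#safe[@]} elements"
echo "unquoted: ${#unsafe[@]} elements"
```

With the two files present, the read form keeps the asterisk as one of three elements, while the unquoted form replaces it with the matching file names and ends up with four.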
Upvotes: 7
Reputation: 16379
Consider:
$ foo='1:2 3:4 5:6'
$ IFS=':'; arr=($foo)
$ echo "${arr[0]}"
1
$ echo "${arr[1]}"
2 3
$ echo "${arr[2]}"
4 5
$ echo "${arr[3]}"
6
Oh well - took me too long to format an answer... +1 @kojiro.
Upvotes: 6