Reputation: 363
I am trying to split a string on the delimiter '|', but one of the fields in my sample data is itself a '|': the run '|||' below should yield a single '|' field rather than two empty fields. How can I achieve this?
f() {
    local IFS='|'
    local foo
    set -f # Disable glob expansion
    foo=( $@ ) # Deliberately unquoted
    set +f
    printf '%d\n' "${#foo[@]}"
    printf '%s\n' "${foo[@]}"
}
f 'un|dodecaedro|per|||tirare|per|i danni'
Expected output:
un
dodecaedro
per
|
tirare
per
i danni
Upvotes: 0
Views: 111
Reputation: 246744
Let's see how a "real" CSV parser handles that data (it reads ||| as two empty fields, not as a literal |):
echo 'un|dodecaedro|per|||tirare|per|i danni' |
ruby -rcsv -ne 'puts CSV.parse_line($_, :col_sep=>"|").join("\n")'
un
dodecaedro
per
tirare
per
i danni
What if we were to use quotes on the "troublesome" field:
echo 'un|dodecaedro|per|"|"|tirare|per|i danni' |
ruby -rcsv -ne 'puts CSV.parse_line($_, :col_sep=>"|").join("\n")'
un
dodecaedro
per
|
tirare
per
i danni
So, you have to ensure your data is in a clean state first.
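If the data can be cleaned up so that the literal | is double-quoted, the same idea can be sketched in plain bash. This is only a minimal illustration, not a CSV parser: the function name split_quoted is made up for this example, and it assumes single-line input, toggles quoting on every double quote, and does not handle escaped or doubled quotes inside a field.

```bash
#!/usr/bin/env bash

# split_quoted (hypothetical helper): split $1 on '|', except inside
# double quotes. The quote characters themselves are dropped.
split_quoted() {
    local line=$1 field='' ch
    local -i in_quotes=0 i
    for (( i = 0; i < ${#line}; i++ )); do
        ch=${line:i:1}
        if [[ $ch == '"' ]]; then
            # Toggle the "inside quotes" state
            in_quotes=$(( 1 - in_quotes ))
        elif [[ $ch == '|' ]] && (( in_quotes == 0 )); then
            # Unquoted delimiter: emit the field collected so far
            printf '%s\n' "$field"
            field=''
        else
            field+=$ch
        fi
    done
    # Emit the last field
    printf '%s\n' "$field"
}

split_quoted 'un|dodecaedro|per|"|"|tirare|per|i danni'
```

With the quoted sample this should print the same seven lines as the Ruby run above, with | as the fourth one.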
Upvotes: 0
Reputation: 16997
There may be better ways to produce the output you expect, but here is my approach. It assumes a recent version of bash in which here-strings (<<<) are supported.
string='un|dodecaedro|per|||tirare|per|i danni'
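awk '{
    n = split($0, A, "|")
    for (i = 1; i <= n; i++) {
        # Two consecutive empty fields stand for one literal "|".
        # The i < n guard avoids reading past the last field.
        if (i < n && length(A[i]) == 0 && length(A[i+1]) == 0) {
            print "|"; i += 1
        } else {
            print A[i]
        }
    }
}' <<<"$string"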
Resulting
$ bash f
un
dodecaedro
per
|
tirare
per
i danni
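The same "two empty fields in a row mean a literal |" rule can also be sketched without awk. The version below is an assumption-laden illustration: the function name split_pipes is invented here, the input must be a single line, and the byte 0x1f (ASCII unit separator) is assumed never to occur in the data, because it is used as a temporary placeholder. It swaps every ||| for |<placeholder>| before splitting, then turns the placeholder back into |.

```bash
#!/usr/bin/env bash

# split_pipes (hypothetical helper): split $1 on '|', treating the
# run '|||' as a field that contains a literal '|'.
split_pipes() {
    local input=$1
    local placeholder=$'\x1f'      # assumption: never present in the data
    local triple='|||'
    local repl="|${placeholder}|"
    local -a fields
    local f

    # Mark every '|||' so the middle pipe survives the split
    input=${input//"$triple"/$repl}

    # Split on '|' into an array; IFS is changed only for this read
    IFS='|' read -r -a fields <<< "$input"

    for f in "${fields[@]}"; do
        # Restore the literal pipe where the placeholder was inserted
        [[ $f == "$placeholder" ]] && f='|'
        printf '%s\n' "$f"
    done
}

split_pipes 'un|dodecaedro|per|||tirare|per|i danni'
```

As with the awk version, the input stays ambiguous: there is no way to tell a literal | apart from two fields that are really empty.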
Upvotes: 1
Reputation: 437062
You can try to use sed to simply replace two consecutive newlines (that result from set's parsing and output with \n as the separator) with \n| after the fact (but @Ed Morton's caveat re ambiguity applies):
f() {
    local IFS='|'
    local foo
    set -f # Disable glob expansion
    foo=( $@ ) # Deliberately unquoted
    set +f
    printf '%d\n' "${#foo[@]}"
    printf '%s\n' "${foo[@]}" | sed ':a; N; $!ba; s/\n\n/\n|/g'
}
f 'un|dodecaedro|per|||tirare|per|i danni'
Upvotes: 0
Reputation: 784898
Crude way to get this output from awk:
s='un|dodecaedro|per|||tirare|per|i danni'
awk '{p=$0; while ((n=index(p, "|"))) { m=(n==1)?n:n-1; print substr(p, 1, m);
p=substr(p, n+1)}; print p }' <<< "$s"
un
dodecaedro
per
|
|
tirare
per
i danni
Upvotes: 1