doubledecker

Reputation: 363

Split string using delimiter

I am trying to split a string on the delimiter '|'. However, where my sample data contains three delimiters in a row ('|||'), I want the middle '|' to come out as a field of its own. How can I achieve this?

f() {
  local IFS='|'
  local foo
  set -f # Disable glob expansion
  foo=( $@ ) # Deliberately unquoted
  set +f
  printf '%d\n' "${#foo[@]}"
  printf '%s\n' "${foo[@]}"
}

f 'un|dodecaedro|per|||tirare|per|i danni'

Expected output:

un
dodecaedro
per
|
tirare
per
i danni

Upvotes: 0

Views: 111

Answers (4)

glenn jackman

Reputation: 246744

Let's see how a "real" CSV parser handles that data:

echo 'un|dodecaedro|per|||tirare|per|i danni' | 
ruby -rcsv -ne 'puts CSV.parse_line($_, :col_sep=>"|").join("\n")'
un
dodecaedro
per


tirare
per
i danni

What if we were to use quotes on the "troublesome" field:

echo 'un|dodecaedro|per|"|"|tirare|per|i danni' |
ruby -rcsv -ne 'puts CSV.parse_line($_, :col_sep=>"|").join("\n")'
un
dodecaedro
per
|
tirare
per
i danni

So you have to make sure your data is in a clean, unambiguous state first, for example by quoting any field that contains the delimiter.
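If Ruby is not available, quoted input like the second example can also be read in plain bash. The following is only a rough sketch: the function name split_quoted is made up for illustration, and it assumes a double quote merely toggles "inside a quoted field" (it does not handle escaped or doubled quotes the way a real CSV parser does).

```bash
#!/usr/bin/env bash
# Hypothetical helper: split "$1" on '|', except inside double quotes.
# Simplification: quotes only toggle state and are dropped from the output;
# escaped/doubled quotes ("") are NOT supported.
split_quoted() {
  local line=$1 field='' ch
  local -i in_quotes=0 i
  for ((i = 0; i < ${#line}; i++)); do
    ch=${line:i:1}
    if [[ $ch == '"' ]]; then
      in_quotes=$(( 1 - in_quotes ))        # enter or leave a quoted field
    elif [[ $ch == '|' ]] && (( ! in_quotes )); then
      printf '%s\n' "$field"                # unquoted delimiter ends the field
      field=''
    else
      field+=$ch                            # ordinary character (or quoted '|')
    fi
  done
  printf '%s\n' "$field"                    # last field has no trailing delimiter
}

split_quoted 'un|dodecaedro|per|"|"|tirare|per|i danni'
```

For that input it should print the same seven lines as the Ruby run above, with '|' as the fourth field.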

Upvotes: 0

Akshay Hegde

Reputation: 16997

There may be a better way to produce the output you expect, but here is my approach. It assumes you are using a recent version of bash that supports here-strings (<<<).

string='un|dodecaedro|per|||tirare|per|i danni'

awk '{
    n=split($0,A,"|")
    for(i=1;i<=n;i++)
    {
        if(length(A[i]) == 0 && length(A[i+1])==0)
        {
            print "|"; i+=1
        }
        else
        {
            print A[i]
        }
    }
     }'  <<<"$string"

Resulting

 $ bash f
 un
 dodecaedro
 per
 |
 tirare
 per
 i danni

Upvotes: 1

mklement0

Reputation: 437062

You can use sed to replace each pair of consecutive newlines (which result from the two empty fields being printed with \n as the separator) with \n| after the fact (but @Ed Morton's caveat re ambiguity applies):

f() {
  local IFS='|'
  local foo
  set -f # Disable glob expansion
  foo=( $@ ) # Deliberately unquoted
  set +f
  printf '%d\n' "${#foo[@]}"
  printf '%s\n' "${foo[@]}" | sed ':a; N; $!ba; s/\n\n/\n|/g'
}

f 'un|dodecaedro|per|||tirare|per|i danni'
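The same post-processing can also be sketched in pure bash, without piping through sed. This is a minimal sketch under assumptions: the function name split_keep_pipe is hypothetical, the input is a single line, and a pair of adjacent empty fields is always taken to mean a literal '|', so the ambiguity caveat still applies (two genuinely empty fields cannot be told apart from '|||').

```bash
#!/usr/bin/env bash
# Hypothetical helper: split "$1" on '|' and turn each pair of adjacent
# empty fields (what '|||' produces) into a single literal '|' field.
split_keep_pipe() {
  local IFS='|'
  local -a fields out=()
  local -i i
  # read -ra splits on IFS without glob expansion, so set -f is not needed.
  # Note: read drops a trailing empty field and only reads the first line.
  read -ra fields <<< "$1"
  for ((i = 0; i < ${#fields[@]}; i++)); do
    if [[ -z ${fields[i]} ]] && (( i + 1 < ${#fields[@]} )) && [[ -z ${fields[i+1]} ]]; then
      out+=('|')
      i=$(( i + 1 ))          # skip the second empty field of the pair
    else
      out+=("${fields[i]}")
    fi
  done
  printf '%s\n' "${out[@]}"
}

split_keep_pipe 'un|dodecaedro|per|||tirare|per|i danni'
```

For the sample string this should print the seven lines listed as the expected output in the question.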

Upvotes: 0

anubhava

Reputation: 784898

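A crude way to get close to this output with awk (note that it prints a '|' line for each of the two extra delimiters in '|||', not just one):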

s='un|dodecaedro|per|||tirare|per|i danni'
awk '{p=$0; while ((n=index(p, "|"))) { m=(n==1)?n:n-1; print substr(p, 1, m);
      p=substr(p, n+1)}; print p }' <<< "$s"
un
dodecaedro
per
|
|
tirare
per
i danni

Upvotes: 1
