keeer

Reputation: 863

Dealing with linebreaks in while read line

Maybe I'm using the wrong tool for the job here...

My data looks like this (it comes from a JSON file that has been converted to CSV):

"hostname1",1,""
"hostname2",1,""
"hostname3",0,"yay_some_text
more_text
more_text
"

The first column is the hostname, the second is the exit code, and the third is the result. I usually do something like this to make a moderately pretty table:

cat tmp.file | ( while read line
do
name=$(echo $line | awk -F "," '{print $1}')
exit_code=$(echo $line | awk -F "," '{print $2}')
output=$(echo $line | awk -F "," '{print $3}')
#I can then do stuff with the output here and ultimately do this:
echo -e "|${name}\t|${exit_code}\t|${output}\t|"
done
)

However, the third column is causing me no end of problems; I think that regardless of what I do, the read line part will make this impossible. Does anyone have a better method of handling this? I'd ideally like to keep the linebreaks, but if that's going to be too hard, I'll happily replace them with commas.

Desired output (either is fine):

| hostname1 | 1 | |
| hostname2 | 1 | |
| hostname3 | 0 | yay_some_text
 more_text 
more_text |

or:

| hostname1 | 1 | |
| hostname2 | 1 | |
| hostname3 | 0 | yay_some_text, more_text, more_text |

Upvotes: 0

Views: 78

Answers (3)

karakfa

Reputation: 67507

Setting RS to a quote followed by a newline keeps each multi-line quoted field inside a single record, while FPAT splits fields on commas that fall outside quotes:

$ gawk -v RS='"\n' -v FPAT='[^,]*|"[^"]*"' -v OFS=' | ' '
    {gsub(/"/,""); $1=$1; print OFS $0 OFS}' file


 | hostname1 | 1 |  |
 | hostname2 | 1 |  |
 | hostname3 | 0 | yay_some_text
more_text
more_text
 |

Upvotes: 1

Ed Morton

Reputation: 204164

Whichever of these you prefer will work robustly* and efficiently using any awk in any shell on every UNIX box:

$ cat tst.awk
{ rec = rec $0 ORS }
/"$/ {
    gsub(/[[:space:]]*"[[:space:]]*/,"",rec)
    gsub(/,/," | ",rec)
    printf "| %s |\n", rec
    rec = ""
}


$ awk -f tst.awk file
| hostname1 | 1 |  |
| hostname2 | 1 |  |
| hostname3 | 0 | yay_some_text
more_text
more_text |


$ cat tst.awk
{ rec = rec $0 RS }
/"$/ {
    gsub(/[[:space:]]*"[[:space:]]*/,"",rec)
    gsub(/,/," | ",rec)
    gsub(RS,", ",rec)
    printf "| %s |\n", rec
    rec = ""
}


$ awk -f tst.awk file
| hostname1 | 1 |  |
| hostname2 | 1 |  |
| hostname3 | 0 | yay_some_text, more_text, more_text |

*robust assuming your quoted strings never contain commas or escaped double quotes, i.e. the data looks like the example you provided and like what your existing code already relies on.
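As a hedged illustration of what that assumption buys you (this sketch is not part of the original answer): if fields could contain embedded commas, GNU awk's FPAT can split quote-aware fields on single-line records. It assumes gawk is available and that quoted strings still contain no escaped double quotes:

```shell
# Sketch only: quote-aware CSV splitting with GNU awk's FPAT.
# A field is either a run of non-commas or a double-quoted string.
printf '%s\n' '"host,with,commas",1,""' |
gawk -v FPAT='([^,]*)|("[^"]*")' '{
    for (i = 1; i <= NF; i++) gsub(/^"|"$/, "", $i)  # strip surrounding quotes
    printf "| %s | %s | %s |\n", $1, $2, $3
}'
# prints: | host,with,commas | 1 |  |
```

The quoted-string alternative in FPAT matches longest, so the commas inside the quotes stay in one field.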

Upvotes: 4

Sam Daniel

Reputation: 1902

In your case, one way is to transform the file into a simpler structure before processing it:

  awk '/[^"]$/ { printf("%s", $0); next } 1' tmp.file | ( while read line
  do
    name=$(echo "$line" | awk -F ',' '{print $1}')
    exit_code=$(echo "$line" | awk -F ',' '{print $2}')
    output=$(echo "$line" | awk -F ',' '{print $3}')
    #I can then do stuff with the output here and ultimately do this:
    echo -e "|${name}\t|${exit_code}\t|${output}\t|"
  done
  )

If all you want to do is display a table, you can use the column utility:

awk '/[^"]$/ { printf("%s", $0); next } 1' tmp.file | column -t -o "  |  " -s ,

If you particularly want the leading and trailing separator '|', you can simply pipe the output of this command through sed or awk.
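For example, one possible sed expression (a sketch, not part of the original answer; the sample data is inlined for illustration, and the -o flag requires the util-linux column):

```shell
# Add a leading "| " and trailing " |" to every row of the table.
printf '%s\n' '"hostname1",1,""' '"hostname2",1,""' |
awk '/[^"]$/ { printf("%s", $0); next } 1' |
column -t -s , -o ' | ' |
sed 's/^/| /; s/$/ |/'
```

Note that the surrounding double quotes from the CSV are still present in the output; an extra `tr -d '"'` stage would remove them if desired.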

Upvotes: 0
