Andrey Khmelev

Reputation: 1161

How to concatenate lines with Unix commands

I have the following file, myfile.txt:

"field1","val1","val2","val3"
"field2","val1","val2","val3"
"field3","val1","va
  l2","va
  l3"
"field4","val1","val2","val3"

I want to bring this file back to its normal form, like this:

"field1","val1","val2","val3"
"field2","val1","val2","val3"
"field3","val1","val2","val3"
"field4","val1","val2","val3"

So I am trying to do that with the following commands:

filename=myfile.txt

while read line
do
   found=$(grep '^[^"]')
   if [ "$found" ]; then          
      #think here must be command "paste"      
   fi   
done < $filename

but something is wrong. Please help me; I am not a guru in Unix commands.

Upvotes: 0

Views: 81

Answers (3)

anubhava

Reputation: 785611

Without knowing the number of fields in the input, you can use this GNU awk solution, which relies on FPAT and gensub:

awk -v RS= -v FPAT='("[^"]*"|[^,"]+),?' -v OFS= '{
      for (h=1; h<=NF; h++) $h = gensub(/([^"])\n[[:blank:]]*/, "\\1", "g", $h); } 1' file

"field1","val1","val2","val3"
"field2","val1","val2","val3"
"field3","val1","val2","val3"
"field4","val1","val2","val3"

To save the changes back to the file, use:

awk -i inplace -v RS= -v FPAT='("[^"]*"|[^,"]+),?' -v OFS= '{
       for (h=1; h<=NF; h++) $h = gensub(/([^"])\n[[:blank:]]*/, "\\1", "g", $h); } 1' file
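
FPAT, gensub and -i inplace are GNU awk extensions. Where only a POSIX awk is available, a different technique can be sketched: keep appending lines to a buffer until it contains an even number of double quotes, which means every quoted field has been closed. This is not the method above, only a rough alternative; the function name join_by_quote_parity is made up for illustration, and the sketch assumes that values never contain escaped or doubled quotes.

```shell
# Hypothetical helper (not part of the answer above): join physical lines
# until the buffer holds an even number of double quotes.
join_by_quote_parity() {
  awk '
    {
      sub(/^[ \t]+/, "")            # drop indentation of continuation lines
      buf = buf $0                  # append this line to the pending record
      tmp = buf
      if (gsub(/"/, "", tmp) % 2 == 0) {   # gsub returns the number of quotes
        print buf                   # even count: every quoted field is closed
        buf = ""
      }
    }
    END { if (buf != "") print buf }  # flush an unterminated record, if any
  ' "$@"
}
```

It can be called as join_by_quote_parity myfile.txt, or fed through a pipe.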

Upvotes: 0

RomanPerekhrest

Reputation: 92874

sed solution:

sed -Ez 's/[[:space:]]+//g; s/""/","/g; s/(([^,]+,){3})([^,]+),/\1\3\n/g; $a\\' myfile.txt
  • -z - (GNU sed) treat the input as lines separated by a null (zero) character instead of newlines, so the whole file is handled as one string

  • s/[[:space:]]+//g - remove all whitespace, including line breaks, between and within lines

  • s/""/","/g - restore the comma between adjacent fields that were joined when the line breaks were removed

  • s/(([^,]+,){3})([^,]+),/\1\3\n/g - put a line break (record separator) after every 4th field

  • $a\\ - append the final newline at the end of the content
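
The third substitution hard-codes four fields per record. As a rough sketch of a variant that does not count fields: once all whitespace is gone, two double quotes touch only where one record ended and the next began, so the line break can go right there. The function name join_by_adjacent_quotes is hypothetical, tr and awk stand in for sed -z, and the sketch assumes there are no empty fields ("") and no meaningful whitespace inside values.

```shell
# Hypothetical helper: strip all whitespace, then break the stream wherever
# two double quotes touch. Assumes no empty fields ("") and no meaningful
# whitespace inside values.
join_by_adjacent_quotes() {
  tr -d '[:space:]' |
    awk '{ gsub(/""/, "\"\n\""); print }'   # print also adds the final newline
}
```

Run as join_by_adjacent_quotes < myfile.txt, it yields the same four records on the sample file as the sed command.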


The output:

"field1","val1","val2","val3"
"field2","val1","val2","val3"
"field3","val1","val2","val3"
"field4","val1","val2","val3"

Upvotes: 1

Carlos Afonso

Reputation: 1957

Try this:

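filename=$1

# -r keeps backslashes literal; read also trims leading and trailing blanks
while read -r line
do
   # append the line; grep passes it only if it has a character other than "
   found=$found$(echo "$line" | grep '[^"]')
   # a record is complete once the buffer is non-empty and ends with a quote
   if [[ -n $found && $found == *\" ]]; then
       echo "$found"
       found=''
   fi
done < "$filename"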
  1. Each line that is read gets appended to the variable $found; this is how the "broken lines" are joined.
  2. The if then checks that $found is not empty (-n does just that) and that $found ends with a double quote, as suggested by @Barmar.

If it does end with a quote, the record is complete, so you echo $found and reset the variable to empty.
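
The same idea can be sketched as a function that reads standard input and avoids starting grep for every line. This is only an illustration under the same assumption as above, namely that a line break never falls directly after a double quote inside a record; the names join_broken_lines and buffer are made up for the example.

```shell
# Hypothetical helper: append lines to a buffer and print the buffer
# whenever it ends with a double quote.
join_broken_lines() {
  buffer=''
  # "|| [ -n "$line" ]" also handles a last line without a trailing newline;
  # read trims the leading blanks of continuation lines
  while read -r line || [ -n "$line" ]; do
    buffer=$buffer$line
    case $buffer in
      *\")                          # buffer ends with a quote: record complete
        printf '%s\n' "$buffer"
        buffer=''
        ;;
    esac
  done
  # print whatever is left if the input ended in the middle of a record
  if [ -n "$buffer" ]; then printf '%s\n' "$buffer"; fi
}
```

It can be used as join_broken_lines < myfile.txt.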

Upvotes: 1
