nagothu

Reputation: 39

Append next line to current line until pattern is matched in awk

Input file data:

"1","123","hh
KKK,111,ll
Jk"
"2","124","jj"

Output data:

"1","123","hh KKK,111,ll Jk"
"2","124","jj"

I tried the code below in an awk file, but it still does not produce the desired output:

BEGIN {
    FS = "\",\""
    record_lock_flag = 0
    total_feilds = 3
    tmp_field_count = 0
    tmp_rec_buff = ""
    lines = 0
}
{
    if (NR > 0) {
        if (record_lock_flag == 0 && NF == total_feilds && substr($NF, length($NF) - 1, length($NF)) ~ /^"/) {
            print $0
        } else {
            tmp_rec_buff = tmp_rec_buff $0
            tmp_field_count = tmp_field_count + NF
            if ($0 != "") {
                lines++
            }
            rec_lock_flag = 1
            if (tmp_field_count == exp_fields + lines - 1) {
                print tmp_rec_buff
                record_lock_flag = 0
                tmp_field_count = 0
                tmp_rec_buff = ""
                lines = 0
            }
        }
    }
}
END {
}

Upvotes: 0

Views: 169

Answers (4)

Carlos Pascual

Reputation: 1126

With awk, setting ORS for each line depending on whether the line ends with a double quote:

awk '{ORS = (!/"$/) ? " " : "\n"} 1' file
"1","123","hh KKK,111,ll Jk"
"2","124","jj"
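
To see why the ORS approach works, it can help to trace which separator gets chosen for each input line. The following is only a small sketch using the sample data from the question; the variable names `tmp` and `ors_trace` are made up for illustration. It assumes that any line ending in a double quote ends a record, which holds for the sample but would not hold if a continuation line inside a field happened to end with a quote.

```shell
# Sample input from the question, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' '"1","123","hh' 'KKK,111,ll' 'Jk"' '"2","124","jj"' > "$tmp"

# Label each line with the separator that would be printed after it:
# a space when the line does not end in a quote, a newline when it does.
ors_trace=$(awk '{ print NR ": " (/"$/ ? "newline" : "space") }' "$tmp")
printf '%s\n' "$ors_trace"
rm -f "$tmp"
```

Lines 1 and 2 are followed by a space and lines 3 and 4 by a newline, which is what joins the first three input lines into one output record.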

Upvotes: 0

anubhava

Reputation: 786091

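Using GNU awk, we can split records on a closing quote followed by a newline and then either an opening quote or the end of input, replace the newlines left inside each record with spaces, and finally print the matched separator back by setting ORS to RT (assuming no blank field has its opening and closing quotes on separate lines):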

awk -v RS='"\n("|$)' '{gsub(/\n/, " "); ORS=RT} 1' file

"1","123","hh KKK,111,ll Jk"
"2","124","jj"

Another version using GNU awk, for when you already know the number of fields in each record, as shown in your question:

awk -v n=3 -v FPAT='"[^"]*"' 'p {$0 = p " " $0; p=""}
NF < n {p = $0; next} 1' file

"1","123","hh KKK,111,ll Jk"
"2","124","jj"

Upvotes: 2

RavinderSingh13

Reputation: 133760

With your shown samples only, you could try the following awk code. Written and tested with GNU awk.

awk -v RS="" -v FS="\n" '
{
  for(i=1;i<=NF;i++){
    sum+=gsub(/"/,"&",$i)
    val=(val?val OFS:"")$i
    if(sum%2==0){
      print val
      sum=0
      val=""
    }
  }
}
' Input_file

Explanation: a detailed explanation of the code above.

awk -v RS="" -v FS="\n" '    ## RS="" selects paragraph mode (records separated by blank lines); FS="\n" makes each line a field.
{
  for(i=1;i<=NF;i++){        ## Loop over every line of the record.
    sum+=gsub(/"/,"&",$i)    ## Replace each " with itself; gsub returns the number of replacements, which is added to sum.
    val=(val?val OFS:"")$i   ## Append the current line to val, separated by OFS (a space by default).
    if(sum%2==0){            ## An even quote count means every quoted field has been closed.
      print val              ## Print the joined record.
      sum=0                  ## Reset the quote count.
      val=""                 ## Reset the buffer.
    }
  }
}
' Input_file                 ##Mentioning Input_file name here.
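
The same quote-parity idea can also be applied line by line, without paragraph mode, so that the count carries across lines even if a quoted field contains a blank line. This is only a sketch of that variation, not the code from the answer above; the function name `join_quoted` is hypothetical. It assumes quotes inside a field are doubled (`""`), as in conventional CSV, so they do not disturb the parity.

```shell
# Join physical lines into one logical record until the number of
# double quotes seen so far is even. Reads the named files, or stdin.
join_quoted() {
    awk '
    {
        line = $0
        quotes += gsub(/"/, "&", line)        # count the quotes on this line (on a copy)
        buf = (buf == "" ? "" : buf " ") $0   # append the line to the pending record
        if (quotes % 2 == 0) {                # even count: every quoted field is closed
            print buf
            buf = ""
            quotes = 0
        }
    }
    END { if (buf != "") print buf }          # flush an unterminated record, if any
    ' "$@"
}
```

It could be called as `join_quoted Input_file`, or at the end of a pipeline.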

Upvotes: 1

Ed Morton

Reputation: 204558

Using any awk in any shell on every Unix box:

$ awk 'BEGIN{RS=ORS="\""} !(NR%2){gsub(/\n/," ")} 1' file
"1","123","hh KKK,111,ll Jk"
"2","124","jj"
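
The `NR%2` test relies on how the records are numbered once the double quote is the record separator: the text between an opening and a closing quote always lands in an even-numbered record. A small sketch to illustrate this, using a made-up one-line input without a trailing newline (the variable name `pieces` is only for illustration):

```shell
# Split one CSV line on double quotes and print each record with its number.
pieces=$(printf '%s' '"1","a b"' |
    awk 'BEGIN { RS = "\"" } { printf "%d:[%s]\n", NR, $0 }')
printf '%s\n' "$pieces"
```

Record 1 is the empty text before the first quote, records 2 and 4 hold the field contents `1` and `a b`, and record 3 is the comma between the fields, so only the even-numbered records need their newlines replaced.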

See also "What's the most robust way to efficiently parse CSV using awk?".

Upvotes: 3
