radas

Reputation: 347

How to print the values of the 3rd field only in the 3rd column

My file has 3 fields. I want to split the third field and print its values in the third column only, but my output ends up in the first column instead. Here are my file and the output:

cat filename

1st field     2nd field    3rd field
---------     ---------    -----------
a,b,c,d       d,e,f,g,h    1,2,3,4,5,5

q,w,e,r       t,y,g,t,i    9,8,7,6,5,5

I'm using the following command to print the third field in the third column:

cat filename |awk '{print $3}' |tr ',' '\n' 

The output prints the values of the 3rd field in the 1st field's position; I want them printed only in the 3rd field's column.

first field :-
---------------
1
2
3
4
5 
5

expected output

1st field     2nd field    3rd field
---------     ---------    -----------
a,b,c,d       d,e,f,g,h     1
                            2
                            3
                            4
                            5 
                            5

q,w,e,r       t,y,g,t,i     9
                            8
                            7
                            6 
                            5
                            5

Upvotes: 1

Views: 1767

Answers (3)

Akshay Hegde

Reputation: 16997

Input

 [akshay@localhost tmp]$ cat file
 1st field     2nd field    3rd field
 ---------     ---------    -----------
 a,b,c,d       d,e,f,g,h    1,2,3,4,5,5

 q,w,e,r       t,y,g,t,i    9,8,7,6,5,5

Script

 [akshay@localhost tmp]$ cat test.awk
    NR<3 || !NF{ print; next}
    { 
        split($0,D,/[^[:space:]]*/)
        c1=sprintf("%*s",length($1),"")
        c2=sprintf("%*s",length($2),"")
        split($3,A,/,/)
        for(i=1; i in A; i++)
        {   
            if(i==2)
            {
                $1 = c1
                $2 = c2
            }
            printf("%s%s%s%s%d\n",$1,D[2],$2,D[3],A[i]) 
        }
     }

Output

 [akshay@localhost tmp]$ awk -f test.awk file
 1st field     2nd field    3rd field
 ---------     ---------    -----------
 a,b,c,d       d,e,f,g,h    1
                            2
                            3
                            4
                            5
                            5

 q,w,e,r       t,y,g,t,i    9
                            8
                            7
                            6
                            5
                            5

Explanation

  • NR<3 || !NF{ print; next}

NR is the number of records read so far; in short, it holds the current line number.

NF is the number of fields in the current record.

The next statement forces awk to immediately stop processing the current record and go on to the next record.

If the line number is less than 3, or NF is zero (the record has no fields, i.e. it is a blank line), print the current record and go to the next one.
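A small sketch of that pass-through rule, using a made-up four-line sample (the passthrough_demo name and the pass[...]/work[...] labels are only for illustration, not part of the answer's script):

```shell
# Hypothetical helper: runs a tiny sample through the pass-through rule.
passthrough_demo() {
    printf '%s\n' 'head 1' '------' 'a,b 1,2' '' |
    awk '
        NR<3 || !NF { print "pass[" $0 "]"; next }   # two header lines and blank lines
        { print "work[" $0 "]" }                     # every other record
    '
}
passthrough_demo
```

Only the third line reaches the second rule; the two header lines and the blank line are printed as they are and skipped by next.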

  • split($0,D,/[^[:space:]]*/)

To preserve the original formatting, we save the separators between the fields in array D. Because the split regexp matches the non-space runs, the pieces stored in D are the whitespace runs: D[2] is the gap between $1 and $2, and D[3] is the gap between $2 and $3. If you have GNU awk, you can instead use the 4th argument of split(): it splits the line into two arrays, one holding the fields and the other the separators between them, so you can operate on the fields array and rebuild the original $0 by printing each field followed by its separator.
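For illustration, here is a minimal sketch of the 4-argument split() mentioned above. It assumes GNU awk is installed as gawk; the rebuild_demo name and the sample line are made up for the example:

```shell
# Assumes GNU awk ("gawk"); the 4th argument of split() is a gawk extension.
rebuild_demo() {
    printf '%s\n' 'a,b   d,e  1,2' |
    gawk '{
        n = split($0, F, /[[:space:]]+/, S)   # F holds fields, S holds separators
        F[1] = "X"                            # change a field, leave spacing alone
        line = F[1]
        for (i = 2; i <= n; i++)
            line = line S[i-1] F[i]           # S[i-1] sits between F[i-1] and F[i]
        print line
    }'
}
if command -v gawk >/dev/null 2>&1; then
    rebuild_demo
else
    echo "gawk not found, skipping" >&2
fi
```

The first field changes while the original runs of spaces are kept as they were.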

  • c1=sprintf("%*s",length($1),"") and c2=sprintf("%*s",length($2),"")

Here sprintf builds a string of spaces as wide as the field ($1 or $2): %*s takes its width from length($1) or length($2) and pads the empty string to that width.
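A quick way to see the padding, as a sketch (pad_demo and the brackets are only there to make the spaces visible; the * width works in gawk and mawk but may not be available in every awk):

```shell
# Hypothetical helper: shows a field and its blank replacement side by side.
pad_demo() {
    awk 'BEGIN {
        field = "a,b,c,d"
        blank = sprintf("%*s", length(field), "")   # 7 spaces, same width as field
        printf "[%s]\n[%s]\n", field, blank
    }'
}
pad_demo
```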

  • split($3,A,/,/)

split(string, array [, fieldsep [, seps ] ])

Divide string into pieces separated by fieldsep and store the pieces in array and the separator strings in the seps array. The first piece is stored in array[1], the second piece in array[2], and so forth. The string value of the third argument, fieldsep, is a regexp describing where to split string (much as FS can be a regexp describing where to split input records). If fieldsep is omitted, the value of FS is used. split() returns the number of elements created.

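The loop for(i=1; i in A; i++) runs as long as i in A is true. Starting at i=1 and incrementing with i++ fixes the order in which the array is traversed, whereas for (i in A) visits the elements in an unspecified order (thanks to Ed Morton).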
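As a sketch, the two ordered ways of walking the array give the same result: testing i in A, or counting up to the value returned by split(). The loop_demo name and the sample string are assumptions for the example:

```shell
# Hypothetical helper: compares the two ordered loop forms.
loop_demo() {
    awk 'BEGIN {
        n = split("9,8,7,6,5,5", A, /,/)      # n is 6; A[1] is "9", A[6] is "5"
        for (i = 1; i in A; i++)              # stops at the first missing index
            s1 = (i == 1 ? "" : s1 " ") A[i]
        for (i = 1; i <= n; i++)              # same order, using the count
            s2 = (i == 1 ? "" : s2 " ") A[i]
        print s1
        print s2
    }'
}
loop_demo
```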

  • if(i==2) { $1 = c1; $2 = c2 }

When i = 1 we print a,b,c,d and d,e,f,g,h. From the second iteration on, $1 and $2 are replaced with the blank strings c1 and c2 created above, so the first two fields are shown only once, as requested.

  • printf("%s%s%s%s%d\n",$1,D[2],$2,D[3],A[i])

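Finally, print field 1 ($1), the separator between field 1 and field 2 that we saved above (D[2]), field 2 ($2), the separator between field 2 and field 3 (D[3]), and one element of array A per line, which we created with split($3,A,/,/). Note that %d formats A[i] as an integer, so this assumes the third field holds numbers.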

Upvotes: 5

Ed Morton

Reputation: 203209

$ cat tst.awk
NR<3 || !NF { print; next }
{
    front = gensub(/((\S+\s+){2}).*/,"\\1",1)
    split($3,a,/,/)
    for (i=1;i in a;i++) {
        print front a[i]
        gsub(/\S/," ",front)
    }
}

$ awk -f tst.awk file
1st field     2nd field    3rd field
---------     ---------    -----------
a,b,c,d       d,e,f,g,h    1
                           2
                           3
                           4
                           5
                           5

q,w,e,r       t,y,g,t,i    9
                           8
                           7
                           6
                           5
                           5

The above uses GNU awk for gensub(); with other awks, use match()+substr(). It also uses \S and \s, GNU awk shorthand for [^[:space:]] and [[:space:]].
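A possible match()+substr() variant for other awks could look like the sketch below. It is not the script above: it uses [ \t] in place of \s, takes the count from split(), and assumes data lines start in the first column; the split_third_column name and the sample input are made up for the example:

```shell
# Hypothetical helper: same idea as tst.awk, without gensub(), \S or \s.
split_third_column() {
    awk '
    NR<3 || !NF { print; next }
    {
        # Prefix made of: field 1, spaces, field 2, spaces.
        match($0, /^[^ \t]+[ \t]+[^ \t]+[ \t]+/)
        front = substr($0, RSTART, RLENGTH)
        n = split($3, a, /,/)
        for (i = 1; i <= n; i++) {
            print front a[i]
            gsub(/[^ \t]/, " ", front)   # blank the prefix after its first use
        }
    }' "$@"
}
printf '%s\n' 'h1   h2   h3' '--   --   --' 'a,b  c,d  1,2' '' 'q,w  e,r  9' |
    split_third_column
```

With a file name as argument (split_third_column file) it reads the file instead of standard input.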

Upvotes: 1

fedorqui

Reputation: 289505

Considering the columns are tab separated, I would say:

awk 'BEGIN{FS=OFS="\t"}
     NR<=2 || !NF {print; next}
     NR>2{n=split($3,a,",")
          for (i=1;i<=n; i++)
              print (i==1?$1 OFS $2:"" OFS ""), a[i]
         }' file
  • This prints the 1st and 2nd lines and any empty lines unchanged.
  • Then it splits the 3rd field using the comma as the separator.
  • Finally, it loops over the pieces, printing one per line: the first two columns are printed only the first time, and just the value after that.

Test

$ awk 'BEGIN{FS=OFS="\t"} NR<=2 || !NF {print; next} NR>2{n=split($3,a,","); for (i=1;i<=n; i++) print (i==1?$1 OFS $2:"" OFS ""), a[i]}' a
1st field   2nd field   3rd field
---------   ---------   -----------
a,b,c,d d,e,f,g,h   1
        2
        3
        4
        5
        5

q,w,e,r t,y,g,t,i   9
        8
        7
        6
        5
        5

Note that the output looks a bit ugly because the columns are separated by tabs, which do not line up here.

Upvotes: 0
