user10657934

Reputation: 157

Manipulating columns in a text file with awk

I have a tab-separated text file and want to apply a math operation to one column, then write the result to a new tab-separated text file.

This is an example of my file:

chr1    144520803   144520804   12  chr1        144520813   58
chr1    144520840   144520841   12  chr1        144520845   36
chr1    144520840   144520841   12  chr1        144520845   36
chr1    144520848   144520849   14  chr1        144520851   32
chr1    144520848   144520849   14  chr1        144520851   32

I want to change the 4th column: divide every element in the 4th column by the sum of all elements in that column, then multiply by 1000000, as in the expected output.

Expected output:

chr1    144520803   144520804   187500  chr1        144520813   58
chr1    144520840   144520841   187500  chr1        144520845   36
chr1    144520840   144520841   187500  chr1        144520845   36
chr1    144520848   144520849   218750  chr1        144520851   32
chr1    144520848   144520849   218750  chr1        144520851   32

I am trying to do that in awk with the following command, but it does not return what I want. Do you know how to fix it?

awk '{print $1 "\t" $2 "\t" $3 "\t" $4/{sum+=$4}*1000000 "\t" $5 "\t" $6 "\t" $7}'  myfile.txt > new_file.txt

Upvotes: 0

Views: 60

Answers (1)

karakfa

Reputation: 67467

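You need two passes over the file: one to compute the sum of the 4th column, and a second to scale each value by that sum. awk cannot know the total until it has read every line, which is why the one-pass attempt in the question fails.

Something like this (file{,} is bash brace expansion for "file file", so awk reads the same file twice):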

$ awk -v OFS='\t' 'NR==FNR {sum+=$4; next}
                           {$4*=(1000000/sum)}1' file{,} > newfile
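
For reference, here is the same two-pass idea spelled out as a minimal, self-contained sketch. It assumes the input has exactly 7 tab-separated columns (as in the question's print statement) and reuses the question's file names myfile.txt and new_file.txt. The explicit -F'\t' and the sprintf rounding are my additions, not part of the command above. As far as I know, values that are not whole numbers would otherwise be formatted with awk's default %.6g conversion, which can round or switch to exponent notation.

```shell
# Recreate the sample input from the question (assumed: 7 tab-separated columns).
printf 'chr1\t144520803\t144520804\t12\tchr1\t144520813\t58\n' >  myfile.txt
printf 'chr1\t144520840\t144520841\t12\tchr1\t144520845\t36\n' >> myfile.txt
printf 'chr1\t144520840\t144520841\t12\tchr1\t144520845\t36\n' >> myfile.txt
printf 'chr1\t144520848\t144520849\t14\tchr1\t144520851\t32\n' >> myfile.txt
printf 'chr1\t144520848\t144520849\t14\tchr1\t144520851\t32\n' >> myfile.txt

# The file is named twice so awk reads it twice.
# Pass 1: NR==FNR holds only while reading the first file argument; add up column 4.
# Pass 2: replace column 4 with value/sum*1000000, rounded to an integer, and print.
awk -F'\t' -v OFS='\t' '
    NR == FNR { sum += $4; next }
    { $4 = sprintf("%.0f", $4 / sum * 1000000); print }
' myfile.txt myfile.txt > new_file.txt

cat new_file.txt
```

Writing the file name twice instead of file{,} keeps the command portable to shells without brace expansion. Setting the input separator to a tab means any empty column is preserved when awk rebuilds the line after $4 is modified.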

Upvotes: 1
