Text manipulation using AWK

Question

Input file (example_file.txt):

chr20:1000026:T:C, 0.997, 0, 0.998, 0, 0.013, 0.980, 0.989, 1.000, 0, 0.995
chr20:10000775:A:G, 1.000, 0, 0.938, 0, 0, 0.982, 0, 0, 1.985, 1.180

Desired output (using awk):

chr20:1000026:T:C, C, T, 0.997, 0, 0.998, 0, 0.013, 0.980, 0.989, 1.000, 0, 0.995
chr20:10000775:A:G, G, A, 1.000, 0, 0.938, 0, 0, 0.982, 0, 0, 1.985, 1.180

I can get desired output with:

awk '{print $1}' example_file.txt > file1.tmp

awk -F'[:,]' '{print $4",", $3","}' example_file.txt > file2.tmp

awk '{print $2, $3, $4, $5, $6, $7, $8, $9, $10, $11}' example_file.txt > file3.tmp

paste -d' ' file1.tmp file2.tmp file3.tmp > output.file

output.file:

chr20:1000026:T:C, C, T, 0.997, 0, 0.998, 0, 0.013, 0.980, 0.989, 1.000, 0, 0.995
chr20:10000775:A:G, G, A, 1.000, 0, 0.938, 0, 0, 0.982, 0, 0, 1.985, 1.180

This works, but the method is fragmented and tedious, and the actual input files have far more than 11 columns.

James Brown · Accepted Answer

Split $1 on ":" and prepend its parts, from the third one onward, to $2:

$ awk '
BEGIN {
    FS=OFS=", "             # proper field delimiters
}
{
    n=split($1,a,/:/)       # get parts of first field
    for(i=3;i<=n;i++)       # from the 3rd part on
        $2=a[i] OFS $2      # prepend to 2nd field
}1' example_file.txt        # the always-true pattern 1 prints the modified record

Output:

chr20:1000026:T:C, C, T, 0.997, 0, 0.998, 0, 0.013, 0.980, 0.989, 1.000, 0, 0.995
chr20:10000775:A:G, G, A, 1.000, 0, 0.938, 0, 0, 0.982, 0, 0, 1.985, 1.180
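
If the first field always has exactly four colon-separated parts, as in the sample lines, the loop can be replaced by a single assignment. The following is only a sketch under that assumption, not a replacement for the answer above: it appends the 4th and 3rd parts to $1 instead of prepending to $2, so it also works if a line has no second field. The variable names input and result and the temporary file are illustrative; the sample data is copied from the question.

```shell
# Write the two sample lines from the question to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
chr20:1000026:T:C, 0.997, 0, 0.998, 0, 0.013, 0.980, 0.989, 1.000, 0, 0.995
chr20:10000775:A:G, 1.000, 0, 0.938, 0, 0, 0.982, 0, 0, 1.985, 1.180
EOF

result=$(awk '
BEGIN {
    FS = OFS = ", "              # same input and output delimiter as the answer above
}
{
    split($1, a, ":")            # assumption: $1 has exactly 4 parts, a[1]..a[4]
    $1 = $1 OFS a[4] OFS a[3]    # append the 4th part, then the 3rd, to the first field
}
1' "$input")                     # the always-true pattern 1 prints the rebuilt record

printf '%s\n' "$result"
rm -f "$input"
```

Assigning to $1 makes awk rebuild the record with OFS, so every other column passes through untouched no matter how many there are.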
