user19619903

Reputation: 151

Awk fixed width columns and left leaning columns

I have a file named file1 consisting of 4350 lines and 12 columns, as shown below.

ATOM      1  CE1 LIG H   1      75.206  62.966  59.151  0.00  0.00      HAB1  
ATOM      2  NE2 LIG H   1      74.984  62.236  58.086  0.00  0.00      HAB1  
ATOM      3  CD2 LIG H   1      74.926  63.041  57.027  0.00  0.00      HAB1  
.
.
.
ATOM   4348  ZN  ZN2 H   1       1.886  22.818  51.215  0.00  0.00      HAC1  
ATOM   4349  ZN  ZN2 H   1      62.517  30.663   5.219  0.00  0.00      HAC1  
ATOM   4350  ZN  ZN2 H   1      59.442  35.851   2.791  0.00  0.00      HAC1   

I am using awk -v d="74.106" '{$7=sprintf("%0.3f", $7+d)} 1' file1 > file2 to add a value d to the 7th column of file1. After this, my file2 does not retain the correct formatting. A section of file2 is shown below.

ATOM 1 CE1 LIG H 1 149.312 62.966 59.151 0.00 0.00 HAB1
ATOM 2 NE2 LIG H 1 149.090 62.236 58.086 0.00 0.00 HAB1
ATOM 3 CD2 LIG H 1 149.032 63.041 57.027 0.00 0.00 HAB1
.
.
.
ATOM 4348 ZN ZN2 H 1 75.992 22.818 51.215 0.00 0.00 HAC1
ATOM 4349 ZN ZN2 H 1 136.623 30.663 5.219 0.00 0.00 HAC1
ATOM 4350 ZN ZN2 H 1 133.548 35.851 2.791 0.00 0.00 HAC1

I need file2 to keep the same formatting as file1, where only columns 2, 8, and 9 are left leaning. I tried awk -v FIELDWIDTHS="7 6 4 4 4 5 8 8 8 6 6 10" '{print $1 $2 $3 $4 $5 $6 $7 $8 $9 $10 $11 $12}' to specify the maximum width of each of the 12 columns, but this does not change file2. Moreover, I cannot find a way to make columns 2, 8, and 9 left leaning as in file1. How can I achieve these two things?

I appreciate any guidance. Thank you!

Upvotes: 0

Views: 82

Answers (2)

Fravadona

Reputation: 17098

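Well, with the default FS, modifying a field makes awk rebuild the whole record by joining the fields with OFS (a single space by default), so the runs of spaces between your columns are lost.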
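A minimal way to see this effect for yourself, using a made-up three-field line rather than the PDB data: the first command leaves the record alone, the second assigns a field to itself, which is enough to trigger the rebuild.

```shell
# Untouched record: the original spacing is printed as-is.
printf 'a   b   c\n' | awk '{ print }'
# prints "a   b   c"

# Assigning to any field (even to itself) rebuilds $0 with OFS.
printf 'a   b   c\n' | awk '{ $2 = $2 } 1'
# prints "a b c"
```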

What you need to do first is to understand your ATOM record format:

COLUMNS    DATA TYPE      CONTENTS
 1 -  6    Record name    "ATOM  "
 7 - 11    Integer        Atom serial number.
13 - 16    Atom           Atom name.
17         Character      Alternate location indicator.
18 - 20    Residue name   Residue name.
22         Character      Chain identifier.
23 - 26    Integer        Residue sequence number.
27         AChar          Code for insertion of residues.
31 - 38    Real(8.3)      Orthogonal coordinates for X in Angstroms.
39 - 46    Real(8.3)      Orthogonal coordinates for Y in Angstroms.
47 - 54    Real(8.3)      Orthogonal coordinates for Z in Angstroms.
55 - 60    Real(6.2)      Occupancy.
61 - 66    Real(6.2)      Temperature factor (Default = 0.0).
73 - 76    LString(4)     Segment identifier, left-justified.
77 - 78    LString(2)     Element symbol, right-justified.
79 - 80    LString(2)     Charge on the atom.
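As a quick sanity check of those column positions, here is a small sketch that pulls the serial number and the three coordinates out of the first line of your sample by position instead of by whitespace. The variable names (line, serial, x, y, z) are only illustrative.

```shell
# First record from the question, kept verbatim (trailing spaces included).
line='ATOM      1  CE1 LIG H   1      75.206  62.966  59.151  0.00  0.00      HAB1  '

printf '%s\n' "$line" | awk '{
    serial = substr($0,  7, 5) + 0   # columns  7-11
    x      = substr($0, 31, 8) + 0   # columns 31-38
    y      = substr($0, 39, 8) + 0   # columns 39-46
    z      = substr($0, 47, 8) + 0   # columns 47-54
    printf "serial=%d x=%.3f y=%.3f z=%.3f\n", serial, x, y, z
}'
# prints "serial=1 x=75.206 y=62.966 z=59.151"
```

Adding 0 forces awk to treat the padded substring as a number, so the leading spaces do no harm.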

Then you can use substr for generating a modified record:

awk -v d="74.106" '
    /^ATOM  / {
        xCoord = sprintf( "%8.3f", substr($0,31,8) + d )
        $0 = substr($0,1,30) xCoord substr($0,39)
    }
    1
' file.pdb
ATOM      1  CE1 LIG H   1     149.312  62.966  59.151  0.00  0.00      HAB1  
ATOM      2  NE2 LIG H   1     149.090  62.236  58.086  0.00  0.00      HAB1  
ATOM      3  CD2 LIG H   1     149.032  63.041  57.027  0.00  0.00      HAB1  
.
.
.
ATOM   4348  ZN  ZN2 H   1      75.992  22.818  51.215  0.00  0.00      HAC1  
ATOM   4349  ZN  ZN2 H   1     136.623  30.663   5.219  0.00  0.00      HAC1  
ATOM   4350  ZN  ZN2 H   1     133.548  35.851   2.791  0.00  0.00      HAC1   

Upvotes: 3

sseLtaH

Reputation: 11227

Using awk

$ awk -v d=74.106 '/ATOM/{sub($7,sprintf("%0.3f", $7+d))}1' input_file
ATOM      1  CE1 LIG H   1      149.312  62.966  59.151  0.00  0.00      HAB1  
ATOM      2  NE2 LIG H   1      149.090  62.236  58.086  0.00  0.00      HAB1  
ATOM      3  CD2 LIG H   1      149.032  63.041  57.027  0.00  0.00      HAB1  
.
.
.
ATOM   4348  ZN  ZN2 H   1       75.992  22.818  51.215  0.00  0.00      HAC1  
ATOM   4349  ZN  ZN2 H   1      136.623  30.663   5.219  0.00  0.00      HAC1  
ATOM   4350  ZN  ZN2 H   1      133.548  35.851   2.791  0.00  0.00      HAC1

Upvotes: 1
