Reputation: 151
I have a file named file1 consisting of 4350 lines and 12 columns, as shown below.
ATOM 1 CE1 LIG H 1 75.206 62.966 59.151 0.00 0.00 HAB1
ATOM 2 NE2 LIG H 1 74.984 62.236 58.086 0.00 0.00 HAB1
ATOM 3 CD2 LIG H 1 74.926 63.041 57.027 0.00 0.00 HAB1
.
.
.
ATOM 4348 ZN ZN2 H 1 1.886 22.818 51.215 0.00 0.00 HAC1
ATOM 4349 ZN ZN2 H 1 62.517 30.663 5.219 0.00 0.00 HAC1
ATOM 4350 ZN ZN2 H 1 59.442 35.851 2.791 0.00 0.00 HAC1
I am using the following command to add a value d to the 7th column of file1:
awk -v d="74.106" '{$7=sprintf("%0.3f", $7+d)} 1' file1 > file2
After this, my file2 does not retain the original formatting. A section of file2 is shown below.
ATOM 1 CE1 LIG H 1 149.312 62.966 59.151 0.00 0.00 HAB1
ATOM 2 NE2 LIG H 1 149.090 62.236 58.086 0.00 0.00 HAB1
ATOM 3 CD2 LIG H 1 149.032 63.041 57.027 0.00 0.00 HAB1
.
.
.
ATOM 4348 ZN ZN2 H 1 75.992 22.818 51.215 0.00 0.00 HAC1
ATOM 4349 ZN ZN2 H 1 136.623 30.663 5.219 0.00 0.00 HAC1
ATOM 4350 ZN ZN2 H 1 133.548 35.851 2.791 0.00 0.00 HAC1
I need my file2 to keep the same formatting as my file1, where only columns 2, 8, and 9 are left-aligned.
I have tried the following command to specify the width of each of the 12 columns:
awk -v FIELDWIDTHS="7 6 4 4 4 5 8 8 8 6 6 10" '{print $1 $2 $3 $4 $5 $6 $7 $8 $9 $10 $11 $12}'
This command does not change my file2. Moreover, I cannot find a way to left-align columns 2, 8, and 9 as in file1.
How can I achieve these two things?
I appreciate any guidance. Thank you!
Upvotes: 0
Views: 82
Reputation: 17098
Well, with the default FS, awk rebuilds the record with single spaces between the fields as soon as you modify a field, so the runs of spaces that aligned your columns are lost.
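A quick way to see this behaviour is to pass the same line through awk twice, once untouched and once with a no-op field assignment. This is only a small sketch: the sample line and the variable names `line`, `untouched`, and `rebuilt` are made up for illustration.

```shell
# A sample line whose fields are separated by runs of spaces.
line='ATOM      1  CE1 LIG'

# Printing the record without touching any field keeps the spacing.
untouched=$(printf '%s\n' "$line" | awk '1')

# Assigning to any field (even to itself) makes awk rebuild $0,
# joining the fields with OFS, which is a single space by default.
rebuilt=$(printf '%s\n' "$line" | awk '{ $2 = $2 } 1')

printf '[%s]\n[%s]\n' "$untouched" "$rebuilt"
```

The first bracketed line keeps its original spacing; the second has every run of spaces collapsed to one.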
What you need to do first is to understand your ATOM record format:
COLUMNS | DATA TYPE | CONTENTS |
---|---|---|
1 - 6 | Record name | "ATOM " |
7 - 11 | Integer | Atom serial number. |
13 - 16 | Atom | Atom name. |
17 | Character | Alternate location indicator. |
18 - 20 | Residue name | Residue name. |
22 | Character | Chain identifier. |
23 - 26 | Integer | Residue sequence number. |
27 | AChar | Code for insertion of residues. |
31 - 38 | Real(8.3) | Orthogonal coordinates for X in Angstroms. |
39 - 46 | Real(8.3) | Orthogonal coordinates for Y in Angstroms. |
47 - 54 | Real(8.3) | Orthogonal coordinates for Z in Angstroms. |
55 - 60 | Real(6.2) | Occupancy. |
61 - 66 | Real(6.2) | Temperature factor (Default = 0.0). |
73 - 76 | LString(4) | Segment identifier, left-justified. |
77 - 78 | LString(2) | Element symbol, right-justified. |
79 - 80 | LString(2) | Charge on the atom. |
Then you can use substr to generate a modified record:
awk -v d="74.106" '
/^ATOM / {
xCoord = sprintf( "%8.3f", substr($0,31,8) + d )
$0 = substr($0,1,30) xCoord substr($0,39)
}
1
' file.pdb
ATOM 1 CE1 LIG H 1 149.312 62.966 59.151 0.00 0.00 HAB1
ATOM 2 NE2 LIG H 1 149.090 62.236 58.086 0.00 0.00 HAB1
ATOM 3 CD2 LIG H 1 149.032 63.041 57.027 0.00 0.00 HAB1
.
.
.
ATOM 4348 ZN ZN2 H 1 75.992 22.818 51.215 0.00 0.00 HAC1
ATOM 4349 ZN ZN2 H 1 136.623 30.663 5.219 0.00 0.00 HAC1
ATOM 4350 ZN ZN2 H 1 133.548 35.851 2.791 0.00 0.00 HAC1
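The same substr technique could probably be extended to shift all three coordinates at once. The following is a minimal sketch, not a tested tool: the offsets `dx`, `dy`, and `dz` are hypothetical names, and the sample record is built with printf (using values from the question) purely so that every field sits at the column positions listed in the table above.

```shell
# Use the C locale so that "." is the decimal separator everywhere.
export LC_ALL=C

# Build one sample ATOM record at the documented column positions.
record=$(printf 'ATOM  %5d %-4s%1s%3s %1s%4d%1s   %8.3f%8.3f%8.3f%6.2f%6.2f      %-4s' \
    1 ' CE1' ' ' LIG H 1 ' ' 75.206 62.966 59.151 0 0 HAB1)

# Shift X, Y and Z (columns 31-38, 39-46, 47-54) and splice the
# reformatted values back between the untouched parts of the line.
shifted=$(printf '%s\n' "$record" | awk -v dx=74.106 -v dy=0 -v dz=-10 '
/^ATOM / {
    x = sprintf("%8.3f", substr($0, 31, 8) + dx)
    y = sprintf("%8.3f", substr($0, 39, 8) + dy)
    z = sprintf("%8.3f", substr($0, 47, 8) + dz)
    $0 = substr($0, 1, 30) x y z substr($0, 55)
}
1')

printf '%s\n%s\n' "$record" "$shifted"
```

Because each coordinate is rewritten with `%8.3f` into exactly the eight columns it came from, the record length stays the same and the later columns do not move, as long as the shifted values still fit in eight characters.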
Upvotes: 3
Reputation: 11227
Using awk
$ awk -v d=74.106 '/ATOM/{sub($7,sprintf("%0.3f", $7+d))}1' input_file
ATOM 1 CE1 LIG H 1 149.312 62.966 59.151 0.00 0.00 HAB1
ATOM 2 NE2 LIG H 1 149.090 62.236 58.086 0.00 0.00 HAB1
ATOM 3 CD2 LIG H 1 149.032 63.041 57.027 0.00 0.00 HAB1
.
.
.
ATOM 4348 ZN ZN2 H 1 75.992 22.818 51.215 0.00 0.00 HAC1
ATOM 4349 ZN ZN2 H 1 136.623 30.663 5.219 0.00 0.00 HAC1
ATOM 4350 ZN ZN2 H 1 133.548 35.851 2.791 0.00 0.00 HAC1
Upvotes: 1