Reputation: 13
I have a pdb containing a protein and one single ligand. I do not like how the ligand's hydrogens are named (1H2, 2H2, 1H3, 2H3, ...) and I would like something like H1, H2, H3, H4, ...
I wrote the following script. The problem is that it does not seem possible to assign new atom ids: the change is reflected in Atom.id, but it is not present in the output PDB file, which retains the old names.
from Bio.PDB import PDBParser, PDBIO

io = PDBIO()
target_pdb_path = 'mypdb.pdb'
pdb = PDBParser(QUIET=True).get_structure('target', target_pdb_path)[0]
hydrogens = []
for atom in pdb.get_atoms():
    if atom.parent.id[0].startswith('H_'):
        # The atom is an hydrogen and is an HETATM record
        if 'H' in atom.name:
            hydrogens.append(atom)
# Rename hydrogens of the ligand
for h_num, h in enumerate(hydrogens, 1):
    # this is working, but the change is not present in the output pdb structure
    h.id = f'H{h_num} '
io.set_structure(pdb)
io.save('test.pdb')
See how assigning h.id did not change full_id. I also tried replacing the whole full_id tuple, but that does not work either. Unfortunately there seems to be no setter for the id, unlike other properties such as Atom.set_bfactor(), Atom.set_coord(), etc.
In [233]: h.full_id # still the old id 3H18
Out[233]: ('target', 0, 'X', ('H_EOD', 401, ' '), ('3H18', ' '))
In [234]: h.id # my new id
Out[234]: 'H29 '
Does anyone have a solution? Many thanks!
Upvotes: 1
Views: 395
Reputation: 1
I experienced the same issue while attempting to rename selected carbon atoms. There is a very simple way to change atom names.
Changes made to atom.name are not written to the file. However, changes made to atom.fullname are saved properly (https://biopython.org/docs/1.76/api/Bio.PDB.Atom.html):
fullname (string) – full atom name, including spaces, e.g. " CA ". Normally these spaces are stripped from the atom name.
I'm unsure whether this saving issue is related to the Biopython class hierarchy or to where atom.name has its spaces stripped.
Here is my code for changing the carbon atom names for select residues:
from Bio.PDB import PDBParser, PDBIO

parser = PDBParser(PERMISSIVE=1)
structure_id = 'complex'
filename = "complex.pdb"
structure = parser.get_structure(structure_id, filename)
model = structure[0]
for chain in model:
    for res in chain:
        if res.get_resname() in ("ILE", "LEU", "PHE", "TRP", "TYR"):
            for a in res:
                if a.fullname == ' CD ':
                    a.fullname = ' CD1'
                    print("changed to " + a.fullname)
io = PDBIO()
io.set_structure(structure)
io.save("complex-changed-CD.pdb")
Note that the spacing before and after atom CD must match exactly for the if statement to be true. You should be able to easily modify this if statement to match H atoms.
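The exact-spacing requirement comes from the fixed-width PDB format, where the atom name occupies columns 13-16 and, by convention, names of one-letter elements start in the second column of that field. Below is a minimal sketch of that padding convention, in case you want to build the string to compare against fullname instead of typing the spaces by hand. pad_atom_name is a hypothetical helper, not part of Biopython, and the rule it encodes is the common convention rather than something taken from the answer above.

```python
def pad_atom_name(name, element):
    """Return a 4-character PDB atom-name field for `name`.

    Hypothetical helper: names of one-letter elements get a leading
    space unless the name already fills the field or starts with a digit.
    """
    name = name.strip()
    if len(name) > 4:
        raise ValueError(f"atom name too long for PDB format: {name!r}")
    if len(name) < 4 and len(element.strip()) == 1 and name[:1].isalpha():
        name = " " + name
    return name.ljust(4)


print(repr(pad_atom_name("CD", "C")))    # ' CD '
print(repr(pad_atom_name("CD1", "C")))   # ' CD1'
print(repr(pad_atom_name("1H2", "H")))   # '1H2 '
```

With this, the comparison in the loop could be written as a.fullname == pad_atom_name("CD", "C").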
Hope this helps!
Upvotes: 0
Reputation: 3096
Given @nannarito's concerns, I tried to find a way that doesn't need creating new Atom objects. I wasn't able to fully grasp what goes on under the hood of the Biopython PDB module (I tried, but it eludes me). After various attempts I ended up with the following code:
from Bio.PDB import PDBParser, PDBIO

target_pdb_path = 'small_pdb_h_gtp_no-connect_numb.pdb'
pdb = PDBParser(QUIET=True).get_structure('target', target_pdb_path)
hydrogens = []
for atom in pdb[0].get_atoms():
    if atom.parent.get_resname() == 'GTP':
        if atom.parent.id[0].startswith('H_'):
            print(atom.parent.id, atom.name)
            # The atom is an hydrogen and is an HETATM record
            if 'H' in atom.name:
                print('ok')
                hydrogens.append(atom)
print('\n\nhydrogens : \n ', hydrogens, '\n\n')
for h_num, h in enumerate(hydrogens, 1):
    setattr(h, 'fullname', f'W{h_num}')  # or h.fullname = f'W{h_num}'
    print(h.serial_number, h.name, h.id, h.full_id, h.level, h.parent)
atoms = pdb.get_atoms()
for h in atoms:
    print(h.serial_number, h.name, h.id, h.full_id, h.level, h.parent)
io = PDBIO()
io.set_structure(pdb)
io.save('test_new_approach.pdb', preserve_atom_numbering=False)
Using the same input as in my other answer, I get this output file test_new_approach.pdb:
...
HETATM 147 C5 GTP A 180 20.554 34.737 -11.307 1.00 0.00 C
HETATM 148 C6 GTP A 180 19.183 34.712 -11.659 1.00 0.00 C
HETATM 149 O6 GTP A 180 18.205 34.448 -10.957 1.00 0.00 O
HETATM 150 N7 GTP A 180 21.168 34.483 -10.079 1.00 0.00 N
HETATM 151 C8 GTP A 180 22.443 34.655 -10.325 1.00 0.00 C
HETATM 152 N9 GTP A 180 22.724 35.005 -11.630 1.00 0.00 N
HETATM 153 W1 GTP A 180 27.642 33.664 -10.448 1.00 0.00 H
HETATM 154 W2 GTP A 180 26.472 32.436 -10.894 1.00 0.00 H
HETATM 155 W3 GTP A 180 26.872 34.003 -12.692 1.00 0.00 H
HETATM 156 W4 GTP A 180 27.038 36.109 -10.945 1.00 0.00 H
HETATM 157 W5 GTP A 180 26.303 36.091 -13.672 1.00 0.00 H
HETATM 158 W6 GTP A 180 24.683 36.247 -10.440 1.00 0.00 H
HETATM 159 W7 GTP A 180 24.926 37.660 -12.845 1.00 0.00 H
HETATM 160 W8 GTP A 180 23.874 35.594 -13.231 1.00 0.00 H
HETATM 161 W9 GTP A 180 18.670 35.593 -15.377 1.00 0.00 H
HETATM 162 W10 GTP A 180 20.293 35.851 -15.834 1.00 0.00 H
HETATM 163 W11 GTP A 180 27.124 35.555 -7.891 1.00 0.00 H
HETATM 164 W12 GTP A 180 26.059 32.241 -5.339 1.00 0.00 H
HETATM 165 W13 GTP A 180 22.030 35.588 -14.193 1.00 0.00 H
HETATM 166 W14 GTP A 180 30.718 31.035 -4.497 1.00 0.00 H
HETATM 167 W15 GTP A 180 23.174 34.539 -9.606 1.00 0.00 H
TER 168 GTP A 180
END
So setattr(h, 'fullname', f'W{h_num}') seems to do the trick. But for reasons I cannot explain, this loop:
atoms = pdb.get_atoms()
for h in atoms:
    print(h.serial_number, h.name, h.id, h.full_id, h.level, h.parent)
run after the setattr call, whether before or after:
io = PDBIO()
io.set_structure(pdb)
still prints the old names:
...
295 H10 H10 ('target', 0, 'A', ('H_GTP', 180, ' '), ('H10', ' ')) A <Residue GTP het=H_GTP resseq=180 icode= >
296 H11 H11 ('target', 0, 'A', ('H_GTP', 180, ' '), ('H11', ' ')) A <Residue GTP het=H_GTP resseq=180 icode= >
297 H12 H12 ('target', 0, 'A', ('H_GTP', 180, ' '), ('H12', ' ')) A <Residue GTP het=H_GTP resseq=180 icode= >
298 H13 H13 ('target', 0, 'A', ('H_GTP', 180, ' '), ('H13', ' ')) A <Residue GTP het=H_GTP resseq=180 icode= >
299 H14 H14 ('target', 0, 'A', ('H_GTP', 180, ' '), ('H14', ' ')) A <Residue GTP het=H_GTP resseq=180 icode= >
300 H15 H15 ('target', 0, 'A', ('H_GTP', 180, ' '), ('H15', ' ')) A <Residue GTP het=H_GTP resseq=180 icode= >
... the original question gets more intriguing.
I think I found something more: h.name needs to be set too, like:
h.fullname = f'W{h_num}'  # or setattr(h, 'fullname', f'W{h_num}')
h.name = f'W{h_num}'  # or setattr(h, 'name', f'W{h_num}')
which gives:
...
299 W14 H14 ('target', 0, 'A', ('H_GTP', 180, ' '), ('H14', ' ')) A <Residue GTP het=H_GTP resseq=180 icode= >
('target', 0, 'A', ('H_GTP', 180, ' '), ('W14', ' '))
300 W15 H15 ('target', 0, 'A', ('H_GTP', 180, ' '), ('H15', ' ')) A <Residue GTP het=H_GTP resseq=180 icode= >
('target', 0, 'A', ('H_GTP', 180, ' '), ('W15', ' '))
when printing with:
print(h.serial_number, h.name, h.id, h.full_id, h.level, h.parent)
print(h.get_full_id())
So h.get_full_id() returns the updated atom name, but h.id and the stored h.full_id still show the old one when iterating with:
atoms = pdb.get_atoms()
for h in atoms:
    print(h.serial_number, h.name, h.id, h.full_id, h.level, h.parent)
To me it looks like something is going on in the Structure object and its parent/child relationships, since full_id can be corrected by detaching the atom and adding it back:
par = h.parent
par.detach_child(h.id)
h.fullname = f'W{h_num}' # or setattr( h , 'fullname' , f'W{h_num}' )
h.name = f'W{h_num}' # or setattr( h , 'name' , f'W{h_num}')
par.add(h)
but atom.id still doesn't get reinitialized; see:
atoms = pdb.get_atoms()
for h in atoms:
    print(h.serial_number, h.name, h.id, h.full_id, h.level, h.parent)
Output:
299 W14 H14 ('target', 0, 'A', ('H_GTP', 180, ' '), ('W14', ' ')) A <Residue GTP het=H_GTP resseq=180 icode= >
300 W15 H15 ('target', 0, 'A', ('H_GTP', 180, ' '), ('W15', ' ')) A <Residue GTP het=H_GTP resseq=180 icode= >
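The outputs above are consistent with full_id being computed once, at the moment the atom is attached to its residue, while id and name are independent attributes. The toy classes below are a minimal sketch of that behaviour. They are an assumption about the mechanism, not Biopython's actual code, and ToyAtom and ToyResidue are hypothetical names.

```python
class ToyResidue:
    """Stand-in for a residue: keeps children in a dict keyed by atom id."""

    def __init__(self, full_id):
        self.full_id = full_id
        self.child_dict = {}

    def add(self, atom):
        if atom.id in self.child_dict:
            raise ValueError(f"atom {atom.id!r} defined twice")
        self.child_dict[atom.id] = atom
        atom.set_parent(self)

    def detach_child(self, atom_id):
        atom = self.child_dict.pop(atom_id)
        atom.parent = None


class ToyAtom:
    """Stand-in for an atom with separate id, name and a cached full_id."""

    def __init__(self, name):
        self.name = name
        self.id = name
        self.parent = None
        self.full_id = None

    def get_full_id(self):
        # Recomputed from the current name on every call.
        return self.parent.full_id + ((self.name, " "),)

    def set_parent(self, parent):
        self.parent = parent
        # Cached here, so later changes to name do not reach it.
        self.full_id = self.get_full_id()


res = ToyResidue(("target", 0, "A", ("H_GTP", 180, " ")))
h = ToyAtom("H14")
res.add(h)

h.name = "W14"
# id and the cached full_id are stale, get_full_id() sees the new name
print(h.id, h.full_id[-1], h.get_full_id()[-1])

res.detach_child(h.id)
res.add(h)
# full_id is refreshed on re-attach, id is still the old value
print(h.id, h.full_id[-1])
```

In this model nothing ever rewrites id, which would explain why it has to be assigned by hand.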
In the end, try using:
par = h.parent
par.detach_child(h.id)
h.fullname = f'W{h_num}'
h.name = f'W{h_num}'
h.id = f'W{h_num}'
par.add(h)
... but I cannot guarantee that nothing is wrong somewhere else.
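The final sequence can be wrapped in a small function so that the three attributes never get out of step. This is only a sketch, assuming the parent object offers detach_child(id) and add(child) as used above. rename_atom is a hypothetical helper, and the stub classes exist only so the example runs without a PDB file.

```python
def rename_atom(atom, new_name):
    """Rename `atom` and re-register it with its parent under the new id."""
    parent = atom.parent
    parent.detach_child(atom.id)  # must use the old id, before it changes
    atom.fullname = new_name      # the attribute that ended up in the saved file above
    atom.name = new_name
    atom.id = new_name
    parent.add(atom)              # re-attach so the parent indexes the new id
    return atom


# --- stubs standing in for a residue and an atom (illustration only) ---
class StubResidue:
    def __init__(self):
        self.children = {}

    def add(self, atom):
        if atom.id in self.children:
            raise ValueError(f"atom {atom.id!r} already present")
        self.children[atom.id] = atom
        atom.parent = self

    def detach_child(self, atom_id):
        self.children.pop(atom_id).parent = None


class StubAtom:
    def __init__(self, name):
        self.name = self.id = self.fullname = name
        self.parent = None


residue = StubResidue()
for old in ("1H2", "2H2", "1H3"):
    residue.add(StubAtom(old))

# Iterate over a snapshot, because renaming modifies the residue's dict.
for number, atom in enumerate(list(residue.children.values()), 1):
    rename_atom(atom, f"H{number}")

print(sorted(residue.children))  # ['H1', 'H2', 'H3']
```

With real Biopython objects the call inside the loop over hydrogens would be rename_atom(h, f'W{h_num}'). Re-adding an atom may change its position within the residue, so check the atom order in the saved file if it matters to you.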
Upvotes: -1
Reputation: 3096
OK, here is my attempt. I needed to create my own input, and I changed the output: atom names H1 ... H15 become X1 ... X15. Here is the input file small_pdb_h_gtp_no-connect_numb.pdb:
COMPND UNNAMED
AUTHOR GENERATED BY OPEN BABEL 3.1.1
ATOM 1 N VAL A 6 20.799 29.221 8.701 1.00 0.00 N
ATOM 2 CA VAL A 6 20.474 28.731 7.364 1.00 0.00 C
ATOM 3 C VAL A 6 21.733 28.489 6.533 1.00 0.00 C
ATOM 4 O VAL A 6 22.566 29.380 6.440 1.00 0.00 O
ATOM 5 CB VAL A 6 19.553 29.779 6.711 1.00 0.00 C
ATOM 6 CG1 VAL A 6 19.327 29.527 5.222 1.00 0.00 C
ATOM 7 CG2 VAL A 6 18.217 29.772 7.421 1.00 0.00 C
ATOM 8 N VAL A 7 21.974 27.342 5.919 1.00 0.00 N
ATOM 9 CA VAL A 7 23.122 27.169 5.031 1.00 0.00 C
ATOM 10 C VAL A 7 22.555 27.122 3.620 1.00 0.00 C
ATOM 11 O VAL A 7 21.609 26.378 3.344 1.00 0.00 O
ATOM 12 CB VAL A 7 23.896 25.869 5.351 1.00 0.00 C
ATOM 13 CG1 VAL A 7 25.132 25.702 4.504 1.00 0.00 C
ATOM 14 CG2 VAL A 7 24.334 25.857 6.792 1.00 0.00 C
ATOM 15 N VAL A 8 23.076 27.936 2.718 1.00 0.00 N
ATOM 16 CA VAL A 8 22.670 27.904 1.328 1.00 0.00 C
ATOM 17 C VAL A 8 23.672 27.034 0.554 1.00 0.00 C
ATOM 18 O VAL A 8 24.852 27.397 0.465 1.00 0.00 O
ATOM 19 CB VAL A 8 22.621 29.343 0.825 1.00 0.00 C
ATOM 20 CG1 VAL A 8 22.080 29.353 -0.590 1.00 0.00 C
ATOM 21 CG2 VAL A 8 21.725 30.215 1.697 1.00 0.00 C
ATOM 22 N LEU A 9 23.250 25.888 0.004 1.00 0.00 N
ATOM 23 CA LEU A 9 24.132 24.905 -0.620 1.00 0.00 C
ATOM 24 C LEU A 9 23.960 24.767 -2.131 1.00 0.00 C
ATOM 25 O LEU A 9 22.841 24.930 -2.621 1.00 0.00 O
ATOM 26 CB LEU A 9 23.843 23.559 -0.010 1.00 0.00 C
ATOM 27 CG LEU A 9 24.288 23.347 1.414 1.00 0.00 C
ATOM 28 CD1 LEU A 9 23.675 22.070 1.962 1.00 0.00 C
ATOM 29 CD2 LEU A 9 25.809 23.357 1.478 1.00 0.00 C
ATOM 30 N GLY A 10 24.997 24.458 -2.908 1.00 0.00 N
ATOM 31 CA GLY A 10 24.827 24.183 -4.325 1.00 0.00 C
ATOM 32 C GLY A 10 26.134 24.261 -5.102 1.00 0.00 C
ATOM 33 O GLY A 10 27.188 24.556 -4.534 1.00 0.00 O
ATOM 34 N SER A 11 26.094 24.027 -6.416 1.00 0.00 N
ATOM 35 CA SER A 11 27.280 24.074 -7.267 1.00 0.00 C
ATOM 36 C SER A 11 27.618 25.490 -7.674 1.00 0.00 C
ATOM 37 O SER A 11 26.824 26.425 -7.519 1.00 0.00 O
ATOM 38 CB SER A 11 27.073 23.274 -8.536 1.00 0.00 C
ATOM 39 OG SER A 11 26.744 21.934 -8.227 1.00 0.00 O
ATOM 40 N GLY A 12 28.805 25.674 -8.235 1.00 0.00 N
ATOM 41 CA GLY A 12 29.215 26.993 -8.714 1.00 0.00 C
ATOM 42 C GLY A 12 28.280 27.618 -9.752 1.00 0.00 C
ATOM 43 O GLY A 12 27.692 26.945 -10.605 1.00 0.00 O
ATOM 44 N GLY A 13 28.091 28.929 -9.645 1.00 0.00 N
ATOM 45 CA GLY A 13 27.363 29.714 -10.618 1.00 0.00 C
ATOM 46 C GLY A 13 25.862 29.638 -10.485 1.00 0.00 C
ATOM 47 O GLY A 13 25.218 30.556 -10.943 1.00 0.00 O
ATOM 48 N VAL A 14 25.245 28.666 -9.827 1.00 0.00 N
ATOM 49 CA VAL A 14 23.798 28.557 -9.709 1.00 0.00 C
ATOM 50 C VAL A 14 23.042 29.783 -9.203 1.00 0.00 C
ATOM 51 O VAL A 14 21.827 29.901 -9.354 1.00 0.00 O
ATOM 52 CB VAL A 14 23.452 27.347 -8.833 1.00 0.00 C
ATOM 53 CG1 VAL A 14 24.080 26.099 -9.412 1.00 0.00 C
ATOM 54 CG2 VAL A 14 23.860 27.539 -7.373 1.00 0.00 C
ATOM 55 N GLY A 15 23.716 30.722 -8.558 1.00 0.00 N
ATOM 56 CA GLY A 15 23.016 31.867 -8.019 1.00 0.00 C
ATOM 57 C GLY A 15 22.880 31.944 -6.498 1.00 0.00 C
ATOM 58 O GLY A 15 22.118 32.782 -6.042 1.00 0.00 O
ATOM 59 N LYS A 16 23.638 31.182 -5.686 1.00 0.00 N
ATOM 60 CA LYS A 16 23.597 31.199 -4.210 1.00 0.00 C
ATOM 61 C LYS A 16 23.834 32.561 -3.602 1.00 0.00 C
ATOM 62 O LYS A 16 23.068 33.033 -2.760 1.00 0.00 O
ATOM 63 CB LYS A 16 24.593 30.232 -3.570 1.00 0.00 C
ATOM 64 CG LYS A 16 24.311 28.785 -3.934 1.00 0.00 C
ATOM 65 CD LYS A 16 25.192 27.838 -3.179 1.00 0.00 C
ATOM 66 CE LYS A 16 26.652 28.055 -3.425 1.00 0.00 C
ATOM 67 NZ LYS A 16 26.949 27.601 -4.751 1.00 0.00 N1+
ATOM 68 N SER A 17 24.877 33.238 -4.071 1.00 0.00 N
ATOM 69 CA SER A 17 25.190 34.560 -3.573 1.00 0.00 C
ATOM 70 C SER A 17 24.268 35.615 -4.118 1.00 0.00 C
ATOM 71 O SER A 17 24.071 36.602 -3.432 1.00 0.00 O
ATOM 72 CB SER A 17 26.604 34.908 -3.927 1.00 0.00 C
ATOM 73 OG SER A 17 27.409 33.729 -3.728 1.00 0.00 O
ATOM 74 N ALA A 18 23.680 35.451 -5.313 1.00 0.00 N
ATOM 75 CA ALA A 18 22.810 36.455 -5.871 1.00 0.00 C
ATOM 76 C ALA A 18 21.496 36.379 -5.125 1.00 0.00 C
ATOM 77 O ALA A 18 20.955 37.417 -4.776 1.00 0.00 O
ATOM 78 CB ALA A 18 22.613 36.235 -7.354 1.00 0.00 C
ATOM 79 N LEU A 19 20.989 35.191 -4.814 1.00 0.00 N
ATOM 80 CA LEU A 19 19.844 35.014 -3.938 1.00 0.00 C
ATOM 81 C LEU A 19 20.085 35.556 -2.524 1.00 0.00 C
ATOM 82 O LEU A 19 19.305 36.384 -2.066 1.00 0.00 O
ATOM 83 CB LEU A 19 19.470 33.531 -3.907 1.00 0.00 C
ATOM 84 CG LEU A 19 18.824 32.928 -5.141 1.00 0.00 C
ATOM 85 CD1 LEU A 19 18.781 31.438 -5.031 1.00 0.00 C
ATOM 86 CD2 LEU A 19 17.401 33.437 -5.328 1.00 0.00 C
ATOM 87 N THR A 20 21.156 35.175 -1.820 1.00 0.00 N
ATOM 88 CA THR A 20 21.471 35.686 -0.501 1.00 0.00 C
ATOM 89 C THR A 20 21.646 37.203 -0.512 1.00 0.00 C
ATOM 90 O THR A 20 20.988 37.911 0.261 1.00 0.00 O
ATOM 91 CB THR A 20 22.726 34.996 0.064 1.00 0.00 C
ATOM 92 CG2 THR A 20 23.091 35.495 1.437 1.00 0.00 C
ATOM 93 OG1 THR A 20 22.462 33.599 0.110 1.00 0.00 O
ATOM 94 N VAL A 21 22.471 37.750 -1.410 1.00 0.00 N
ATOM 95 CA VAL A 21 22.686 39.189 -1.491 1.00 0.00 C
ATOM 96 C VAL A 21 21.419 39.954 -1.866 1.00 0.00 C
ATOM 97 O VAL A 21 21.240 41.085 -1.417 1.00 0.00 O
ATOM 98 CB VAL A 21 23.871 39.505 -2.398 1.00 0.00 C
ATOM 99 CG1 VAL A 21 24.142 40.980 -2.468 1.00 0.00 C
ATOM 100 CG2 VAL A 21 25.140 38.840 -1.868 1.00 0.00 C
ATOM 101 N GLN A 22 20.472 39.394 -2.613 1.00 0.00 N
ATOM 102 CA GLN A 22 19.197 40.059 -2.841 1.00 0.00 C
ATOM 103 C GLN A 22 18.364 40.109 -1.555 1.00 0.00 C
ATOM 104 O GLN A 22 17.817 41.169 -1.258 1.00 0.00 O
ATOM 105 CB GLN A 22 18.451 39.349 -3.987 1.00 0.00 C
ATOM 106 CG GLN A 22 17.112 39.921 -4.504 1.00 0.00 C
ATOM 107 CD GLN A 22 17.229 41.246 -5.220 1.00 0.00 C
ATOM 108 NE2 GLN A 22 18.028 41.314 -6.268 1.00 0.00 N
ATOM 109 OE1 GLN A 22 16.583 42.229 -4.878 1.00 0.00 O
ATOM 110 N PHE A 23 18.259 39.026 -0.754 1.00 0.00 N
ATOM 111 CA PHE A 23 17.568 39.052 0.531 1.00 0.00 C
ATOM 112 C PHE A 23 18.183 40.098 1.444 1.00 0.00 C
ATOM 113 O PHE A 23 17.500 40.989 1.922 1.00 0.00 O
ATOM 114 CB PHE A 23 17.614 37.706 1.233 1.00 0.00 C
ATOM 115 CG PHE A 23 16.806 37.646 2.525 1.00 0.00 C
ATOM 116 CD1 PHE A 23 15.460 37.952 2.534 1.00 0.00 C
ATOM 117 CD2 PHE A 23 17.420 37.237 3.694 1.00 0.00 C
ATOM 118 CE1 PHE A 23 14.729 37.821 3.695 1.00 0.00 C
ATOM 119 CE2 PHE A 23 16.685 37.107 4.854 1.00 0.00 C
ATOM 120 CZ PHE A 23 15.341 37.392 4.850 1.00 0.00 C
HETATM 121 PA GTP A 180 26.277 33.726 -8.045 1.00 0.00 P
HETATM 122 PB GTP A 180 27.017 31.171 -6.766 1.00 0.00 P
HETATM 123 PG GTP A 180 29.710 30.132 -5.989 1.00 0.00 P
HETATM 124 C5' GTP A 180 26.615 33.475 -10.679 1.00 0.00 C
HETATM 125 O5' GTP A 180 25.804 33.834 -9.555 1.00 0.00 O
HETATM 126 C4' GTP A 180 26.219 34.288 -11.894 1.00 0.00 C
HETATM 127 O4' GTP A 180 24.826 34.017 -12.143 1.00 0.00 O
HETATM 128 C3' GTP A 180 26.372 35.802 -11.724 1.00 0.00 C
HETATM 129 O3' GTP A 180 26.880 36.347 -12.936 1.00 0.00 O
HETATM 130 C2' GTP A 180 24.932 36.243 -11.481 1.00 0.00 C
HETATM 131 O2' GTP A 180 24.719 37.581 -11.901 1.00 0.00 O
HETATM 132 C1' GTP A 180 24.069 35.240 -12.240 1.00 0.00 C
HETATM 133 N1 GTP A 180 19.000 35.036 -13.013 1.00 0.00 N
HETATM 134 O1A GTP A 180 25.089 33.867 -7.187 1.00 0.00 O1-
HETATM 135 O1B GTP A 180 26.072 30.050 -6.958 1.00 0.00 O1-
HETATM 136 O1G GTP A 180 29.197 28.937 -5.265 1.00 0.00 O
HETATM 137 C2 GTP A 180 20.022 35.339 -13.903 1.00 0.00 C
HETATM 138 N2 GTP A 180 19.627 35.619 -15.147 1.00 0.00 N
HETATM 139 O2A GTP A 180 27.427 34.635 -7.843 1.00 0.00 O
HETATM 140 O2B GTP A 180 26.960 31.913 -5.483 1.00 0.00 O
HETATM 141 O2G GTP A 180 30.881 29.816 -6.827 1.00 0.00 O1-
HETATM 142 N3 GTP A 180 21.301 35.367 -13.569 1.00 0.00 N
HETATM 143 O3A GTP A 180 26.807 32.212 -7.961 1.00 0.00 O
HETATM 144 O3B GTP A 180 28.517 30.631 -6.995 1.00 0.00 O
HETATM 145 O3G GTP A 180 30.013 31.278 -5.117 1.00 0.00 O
HETATM 146 C4 GTP A 180 21.489 35.054 -12.257 1.00 0.00 C
HETATM 147 C5 GTP A 180 20.554 34.737 -11.307 1.00 0.00 C
HETATM 148 C6 GTP A 180 19.183 34.712 -11.659 1.00 0.00 C
HETATM 149 O6 GTP A 180 18.205 34.448 -10.957 1.00 0.00 O
HETATM 150 N7 GTP A 180 21.168 34.483 -10.079 1.00 0.00 N
HETATM 151 C8 GTP A 180 22.443 34.655 -10.325 1.00 0.00 C
HETATM 152 N9 GTP A 180 22.724 35.005 -11.630 1.00 0.00 N
HETATM 286 H1 GTP A 180 27.642 33.664 -10.448 1.00 0.00 H
HETATM 287 H2 GTP A 180 26.472 32.436 -10.894 1.00 0.00 H
HETATM 288 H3 GTP A 180 26.872 34.003 -12.692 1.00 0.00 H
HETATM 289 H4 GTP A 180 27.038 36.109 -10.945 1.00 0.00 H
HETATM 290 H5 GTP A 180 26.303 36.091 -13.672 1.00 0.00 H
HETATM 291 H6 GTP A 180 24.683 36.247 -10.440 1.00 0.00 H
HETATM 292 H7 GTP A 180 24.926 37.660 -12.845 1.00 0.00 H
HETATM 293 H8 GTP A 180 23.874 35.594 -13.231 1.00 0.00 H
HETATM 294 H9 GTP A 180 18.670 35.593 -15.377 1.00 0.00 H
HETATM 295 H10 GTP A 180 20.293 35.851 -15.834 1.00 0.00 H
HETATM 296 H11 GTP A 180 27.124 35.555 -7.891 1.00 0.00 H
HETATM 297 H12 GTP A 180 26.059 32.241 -5.339 1.00 0.00 H
HETATM 298 H13 GTP A 180 22.030 35.588 -14.193 1.00 0.00 H
HETATM 299 H14 GTP A 180 30.718 31.035 -4.497 1.00 0.00 H
HETATM 300 H15 GTP A 180 23.174 34.539 -9.606 1.00 0.00 H
Using this code:
from Bio.PDB import PDBParser, PDBIO
from Bio.PDB.Atom import Atom

io = PDBIO()
target_pdb_path = 'small_pdb_h_gtp_no-connect_numb.pdb'
pdb = PDBParser(QUIET=True).get_structure('target', target_pdb_path)[0]
hydrogens = []
for atom in pdb.get_atoms():
    if atom.parent.id[0].startswith('H_'):
        print(atom.parent.id, atom.name)
        # The atom is an hydrogen and is an HETATM record
        if 'H' in atom.name:
            print('ok')
            hydrogens.append(atom)
print('\n\nhydrogens : \n ', hydrogens)
# Rename hydrogens of the ligand
for h_num, h in enumerate(hydrogens, 1):
    # Detach the old atom and add a new Atom that carries the new name
    print(h.id, h.full_id, type(h))
    par = h.parent
    par.detach_child(h.id)
    print(par, ' ... ', h.parent)
    # Atom constructor, see
    # https://biopython.org/docs/latest/api/Bio.PDB.Atom.html?highlight=atom#module-Bio.PDB.Atom
    # Atom(name, coord, bfactor, occupancy, altloc, fullname, serial_number,
    #      element=None, pqr_charge=None, radius=None)
    h_new = Atom(f'X{h_num}', h.get_coord(), h.get_bfactor(), h.get_occupancy(), h.get_altloc(), f'X{h_num}',
                 h.get_serial_number(), 'H', h.get_charge(), h.get_radius())
    par.add(h_new)
    print(h_new.id, h_new.name, h_new.full_id)
    print(h_new.id, h_new.name, h_new.full_id[4][0])
    print(par, h_new.parent, '\n\n')
io.set_structure(pdb)
io.save('test.pdb', preserve_atom_numbering=False)
I get this output:
('H_GTP', 180, ' ') PA
('H_GTP', 180, ' ') PB
('H_GTP', 180, ' ') PG
('H_GTP', 180, ' ') C5'
('H_GTP', 180, ' ') O5'
('H_GTP', 180, ' ') C4'
('H_GTP', 180, ' ') O4'
('H_GTP', 180, ' ') C3'
('H_GTP', 180, ' ') O3'
('H_GTP', 180, ' ') C2'
('H_GTP', 180, ' ') O2'
('H_GTP', 180, ' ') C1'
('H_GTP', 180, ' ') N1
('H_GTP', 180, ' ') O1A
('H_GTP', 180, ' ') O1B
('H_GTP', 180, ' ') O1G
('H_GTP', 180, ' ') C2
('H_GTP', 180, ' ') N2
('H_GTP', 180, ' ') O2A
('H_GTP', 180, ' ') O2B
('H_GTP', 180, ' ') O2G
('H_GTP', 180, ' ') N3
('H_GTP', 180, ' ') O3A
('H_GTP', 180, ' ') O3B
('H_GTP', 180, ' ') O3G
('H_GTP', 180, ' ') C4
('H_GTP', 180, ' ') C5
('H_GTP', 180, ' ') C6
('H_GTP', 180, ' ') O6
('H_GTP', 180, ' ') N7
('H_GTP', 180, ' ') C8
('H_GTP', 180, ' ') N9
('H_GTP', 180, ' ') H1
ok
('H_GTP', 180, ' ') H2
ok
('H_GTP', 180, ' ') H3
ok
('H_GTP', 180, ' ') H4
ok
('H_GTP', 180, ' ') H5
ok
('H_GTP', 180, ' ') H6
ok
('H_GTP', 180, ' ') H7
ok
('H_GTP', 180, ' ') H8
ok
('H_GTP', 180, ' ') H9
ok
('H_GTP', 180, ' ') H10
ok
('H_GTP', 180, ' ') H11
ok
('H_GTP', 180, ' ') H12
ok
('H_GTP', 180, ' ') H13
ok
('H_GTP', 180, ' ') H14
ok
('H_GTP', 180, ' ') H15
ok
hydrogens :
[<Atom H1>, <Atom H2>, <Atom H3>, <Atom H4>, <Atom H5>, <Atom H6>, <Atom H7>, <Atom H8>, <Atom H9>, <Atom H10>, <Atom H11>, <Atom H12>, <Atom H13>, <Atom H14>, <Atom H15>]
H1 ('target', 0, 'A', ('H_GTP', 180, ' '), ('H1', ' ')) <class 'Bio.PDB.Atom.Atom'>
<Residue GTP het=H_GTP resseq=180 icode= > ... None
X1 X1 ('target', 0, 'A', ('H_GTP', 180, ' '), ('X1', ' '))
X1 X1 X1
<Residue GTP het=H_GTP resseq=180 icode= > <Residue GTP het=H_GTP resseq=180 icode= >
H2 ('target', 0, 'A', ('H_GTP', 180, ' '), ('H2', ' ')) <class 'Bio.PDB.Atom.Atom'>
<Residue GTP het=H_GTP resseq=180 icode= > ... None
X2 X2 ('target', 0, 'A', ('H_GTP', 180, ' '), ('X2', ' '))
X2 X2 X2
<Residue GTP het=H_GTP resseq=180 icode= > <Residue GTP het=H_GTP resseq=180 icode= >
H3 ('target', 0, 'A', ('H_GTP', 180, ' '), ('H3', ' ')) <class 'Bio.PDB.Atom.Atom'>
<Residue GTP het=H_GTP resseq=180 icode= > ... None
X3 X3 ('target', 0, 'A', ('H_GTP', 180, ' '), ('X3', ' '))
X3 X3 X3
<Residue GTP het=H_GTP resseq=180 icode= > <Residue GTP het=H_GTP resseq=180 icode= >
H4 ('target', 0, 'A', ('H_GTP', 180, ' '), ('H4', ' ')) <class 'Bio.PDB.Atom.Atom'>
<Residue GTP het=H_GTP resseq=180 icode= > ... None
X4 X4 ('target', 0, 'A', ('H_GTP', 180, ' '), ('X4', ' '))
X4 X4 X4
<Residue GTP het=H_GTP resseq=180 icode= > <Residue GTP het=H_GTP resseq=180 icode= >
H5 ('target', 0, 'A', ('H_GTP', 180, ' '), ('H5', ' ')) <class 'Bio.PDB.Atom.Atom'>
<Residue GTP het=H_GTP resseq=180 icode= > ... None
X5 X5 ('target', 0, 'A', ('H_GTP', 180, ' '), ('X5', ' '))
X5 X5 X5
<Residue GTP het=H_GTP resseq=180 icode= > <Residue GTP het=H_GTP resseq=180 icode= >
H6 ('target', 0, 'A', ('H_GTP', 180, ' '), ('H6', ' ')) <class 'Bio.PDB.Atom.Atom'>
<Residue GTP het=H_GTP resseq=180 icode= > ... None
X6 X6 ('target', 0, 'A', ('H_GTP', 180, ' '), ('X6', ' '))
X6 X6 X6
<Residue GTP het=H_GTP resseq=180 icode= > <Residue GTP het=H_GTP resseq=180 icode= >
H7 ('target', 0, 'A', ('H_GTP', 180, ' '), ('H7', ' ')) <class 'Bio.PDB.Atom.Atom'>
<Residue GTP het=H_GTP resseq=180 icode= > ... None
X7 X7 ('target', 0, 'A', ('H_GTP', 180, ' '), ('X7', ' '))
X7 X7 X7
<Residue GTP het=H_GTP resseq=180 icode= > <Residue GTP het=H_GTP resseq=180 icode= >
H8 ('target', 0, 'A', ('H_GTP', 180, ' '), ('H8', ' ')) <class 'Bio.PDB.Atom.Atom'>
<Residue GTP het=H_GTP resseq=180 icode= > ... None
X8 X8 ('target', 0, 'A', ('H_GTP', 180, ' '), ('X8', ' '))
X8 X8 X8
<Residue GTP het=H_GTP resseq=180 icode= > <Residue GTP het=H_GTP resseq=180 icode= >
H9 ('target', 0, 'A', ('H_GTP', 180, ' '), ('H9', ' ')) <class 'Bio.PDB.Atom.Atom'>
<Residue GTP het=H_GTP resseq=180 icode= > ... None
X9 X9 ('target', 0, 'A', ('H_GTP', 180, ' '), ('X9', ' '))
X9 X9 X9
<Residue GTP het=H_GTP resseq=180 icode= > <Residue GTP het=H_GTP resseq=180 icode= >
H10 ('target', 0, 'A', ('H_GTP', 180, ' '), ('H10', ' ')) <class 'Bio.PDB.Atom.Atom'>
<Residue GTP het=H_GTP resseq=180 icode= > ... None
X10 X10 ('target', 0, 'A', ('H_GTP', 180, ' '), ('X10', ' '))
X10 X10 X10
<Residue GTP het=H_GTP resseq=180 icode= > <Residue GTP het=H_GTP resseq=180 icode= >
H11 ('target', 0, 'A', ('H_GTP', 180, ' '), ('H11', ' ')) <class 'Bio.PDB.Atom.Atom'>
<Residue GTP het=H_GTP resseq=180 icode= > ... None
X11 X11 ('target', 0, 'A', ('H_GTP', 180, ' '), ('X11', ' '))
X11 X11 X11
<Residue GTP het=H_GTP resseq=180 icode= > <Residue GTP het=H_GTP resseq=180 icode= >
H12 ('target', 0, 'A', ('H_GTP', 180, ' '), ('H12', ' ')) <class 'Bio.PDB.Atom.Atom'>
<Residue GTP het=H_GTP resseq=180 icode= > ... None
X12 X12 ('target', 0, 'A', ('H_GTP', 180, ' '), ('X12', ' '))
X12 X12 X12
<Residue GTP het=H_GTP resseq=180 icode= > <Residue GTP het=H_GTP resseq=180 icode= >
H13 ('target', 0, 'A', ('H_GTP', 180, ' '), ('H13', ' ')) <class 'Bio.PDB.Atom.Atom'>
<Residue GTP het=H_GTP resseq=180 icode= > ... None
X13 X13 ('target', 0, 'A', ('H_GTP', 180, ' '), ('X13', ' '))
X13 X13 X13
<Residue GTP het=H_GTP resseq=180 icode= > <Residue GTP het=H_GTP resseq=180 icode= >
H14 ('target', 0, 'A', ('H_GTP', 180, ' '), ('H14', ' ')) <class 'Bio.PDB.Atom.Atom'>
<Residue GTP het=H_GTP resseq=180 icode= > ... None
X14 X14 ('target', 0, 'A', ('H_GTP', 180, ' '), ('X14', ' '))
X14 X14 X14
<Residue GTP het=H_GTP resseq=180 icode= > <Residue GTP het=H_GTP resseq=180 icode= >
H15 ('target', 0, 'A', ('H_GTP', 180, ' '), ('H15', ' ')) <class 'Bio.PDB.Atom.Atom'>
<Residue GTP het=H_GTP resseq=180 icode= > ... None
X15 X15 ('target', 0, 'A', ('H_GTP', 180, ' '), ('X15', ' '))
X15 X15 X15
<Residue GTP het=H_GTP resseq=180 icode= > <Residue GTP het=H_GTP resseq=180 icode= >
and the file test.pdb:
ATOM 1 N VAL A 6 20.799 29.221 8.701 1.00 0.00 N
ATOM 2 CA VAL A 6 20.474 28.731 7.364 1.00 0.00 C
....
....
ATOM 119 CE2 PHE A 23 16.685 37.107 4.854 1.00 0.00 C
ATOM 120 CZ PHE A 23 15.341 37.392 4.850 1.00 0.00 C
HETATM 121 PA GTP A 180 26.277 33.726 -8.045 1.00 0.00 P
HETATM 122 PB GTP A 180 27.017 31.171 -6.766 1.00 0.00 P
HETATM 123 PG GTP A 180 29.710 30.132 -5.989 1.00 0.00 P
HETATM 124 C5' GTP A 180 26.615 33.475 -10.679 1.00 0.00 C
HETATM 125 O5' GTP A 180 25.804 33.834 -9.555 1.00 0.00 O
HETATM 126 C4' GTP A 180 26.219 34.288 -11.894 1.00 0.00 C
HETATM 127 O4' GTP A 180 24.826 34.017 -12.143 1.00 0.00 O
HETATM 128 C3' GTP A 180 26.372 35.802 -11.724 1.00 0.00 C
HETATM 129 O3' GTP A 180 26.880 36.347 -12.936 1.00 0.00 O
HETATM 130 C2' GTP A 180 24.932 36.243 -11.481 1.00 0.00 C
HETATM 131 O2' GTP A 180 24.719 37.581 -11.901 1.00 0.00 O
HETATM 132 C1' GTP A 180 24.069 35.240 -12.240 1.00 0.00 C
HETATM 133 N1 GTP A 180 19.000 35.036 -13.013 1.00 0.00 N
HETATM 134 O1A GTP A 180 25.089 33.867 -7.187 1.00 0.00 O
HETATM 135 O1B GTP A 180 26.072 30.050 -6.958 1.00 0.00 O
HETATM 136 O1G GTP A 180 29.197 28.937 -5.265 1.00 0.00 O
HETATM 137 C2 GTP A 180 20.022 35.339 -13.903 1.00 0.00 C
HETATM 138 N2 GTP A 180 19.627 35.619 -15.147 1.00 0.00 N
HETATM 139 O2A GTP A 180 27.427 34.635 -7.843 1.00 0.00 O
HETATM 140 O2B GTP A 180 26.960 31.913 -5.483 1.00 0.00 O
HETATM 141 O2G GTP A 180 30.881 29.816 -6.827 1.00 0.00 O
HETATM 142 N3 GTP A 180 21.301 35.367 -13.569 1.00 0.00 N
HETATM 143 O3A GTP A 180 26.807 32.212 -7.961 1.00 0.00 O
HETATM 144 O3B GTP A 180 28.517 30.631 -6.995 1.00 0.00 O
HETATM 145 O3G GTP A 180 30.013 31.278 -5.117 1.00 0.00 O
HETATM 146 C4 GTP A 180 21.489 35.054 -12.257 1.00 0.00 C
HETATM 147 C5 GTP A 180 20.554 34.737 -11.307 1.00 0.00 C
HETATM 148 C6 GTP A 180 19.183 34.712 -11.659 1.00 0.00 C
HETATM 149 O6 GTP A 180 18.205 34.448 -10.957 1.00 0.00 O
HETATM 150 N7 GTP A 180 21.168 34.483 -10.079 1.00 0.00 N
HETATM 151 C8 GTP A 180 22.443 34.655 -10.325 1.00 0.00 C
HETATM 152 N9 GTP A 180 22.724 35.005 -11.630 1.00 0.00 N
HETATM 153 X1 GTP A 180 27.642 33.664 -10.448 1.00 0.00 H
HETATM 154 X2 GTP A 180 26.472 32.436 -10.894 1.00 0.00 H
HETATM 155 X3 GTP A 180 26.872 34.003 -12.692 1.00 0.00 H
HETATM 156 X4 GTP A 180 27.038 36.109 -10.945 1.00 0.00 H
HETATM 157 X5 GTP A 180 26.303 36.091 -13.672 1.00 0.00 H
HETATM 158 X6 GTP A 180 24.683 36.247 -10.440 1.00 0.00 H
HETATM 159 X7 GTP A 180 24.926 37.660 -12.845 1.00 0.00 H
HETATM 160 X8 GTP A 180 23.874 35.594 -13.231 1.00 0.00 H
HETATM 161 X9 GTP A 180 18.670 35.593 -15.377 1.00 0.00 H
HETATM 162 X10 GTP A 180 20.293 35.851 -15.834 1.00 0.00 H
HETATM 163 X11 GTP A 180 27.124 35.555 -7.891 1.00 0.00 H
HETATM 164 X12 GTP A 180 26.059 32.241 -5.339 1.00 0.00 H
HETATM 165 X13 GTP A 180 22.030 35.588 -14.193 1.00 0.00 H
HETATM 166 X14 GTP A 180 30.718 31.035 -4.497 1.00 0.00 H
HETATM 167 X15 GTP A 180 23.174 34.539 -9.606 1.00 0.00 H
TER 168 GTP A 180
END
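A quick way to confirm the rename is to read the atom-name field back from the saved file. The sketch below assumes standard fixed-width PDB lines (record type in columns 1-6, atom name in columns 13-16). hetatm_names and make_line are hypothetical helpers, and the sample lines are built in place rather than read from test.pdb.

```python
def hetatm_names(lines):
    """Return the atom names found in HETATM records of fixed-width PDB lines."""
    return [line[12:16].strip() for line in lines if line.startswith("HETATM")]


def make_line(record, serial, name, resname="GTP", chain="A", resseq=180):
    """Build a truncated fixed-width PDB line (up to the residue number)."""
    return f"{record:<6}{serial:>5} {name:<4} {resname:>3} {chain}{resseq:>4}"


sample = [
    make_line("ATOM", 120, " CZ", resname="PHE", resseq=23),
    make_line("HETATM", 152, " N9"),
    make_line("HETATM", 153, " X1"),
    make_line("HETATM", 154, " X2"),
]
print(hetatm_names(sample))  # ['N9', 'X1', 'X2']
```

On the real file this would be used as: with open('test.pdb') as fh: print(hetatm_names(fh)).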
Upvotes: 0