Reputation: 205
I need to parse a PDB file using Biopython in order to extract each line that pertains to an alpha carbon (CA). Here is the code that I use:
from Bio.PDB import *
parser=PDBParser()
io = PDBIO()
structure_2 = parser.get_structure('Y', 'A.pdb')
for l in structure_2:
    if atom.get_id() == 'CA':
        io.set_structure(atom)
        io.save("alpha.pdb")
My idea is that the for loop will go through each line of the PDB file and write each line that pertains to an alpha carbon ('CA') to a new PDB file called alpha.pdb. Here is a short preview of what structure_2 looks like:
ATOM 1 N LYS A 35 -5.054 29.359 -1.504 1.00 61.86 N
ATOM 2 CA LYS A 35 -5.430 28.077 -0.842 1.00 61.30 C
ATOM 3 C LYS A 35 -4.188 27.450 -0.230 1.00 59.47 C
ATOM 4 O LYS A 35 -3.142 27.339 -0.875 1.00 59.94 O
ATOM 5 CB LYS A 35 -6.055 27.113 -1.860 1.00 63.54 C
ATOM 6 CG LYS A 35 -7.354 26.443 -1.409 1.00 65.88 C
ATOM 7 CD LYS A 35 -7.126 25.382 -0.339 1.00 66.83 C
ATOM 8 CE LYS A 35 -8.363 24.507 -0.172 1.00 67.47 C
ATOM 9 NZ LYS A 35 -8.010 23.158 0.355 1.00 68.07 N
ATOM 10 N TYR A 36 -4.293 27.093 1.042 1.00 56.18 N
ATOM 11 CA TYR A 36 -3.183 26.472 1.741 1.00 52.61 C
ATOM 12 C TYR A 36 -3.455 24.992 1.893 1.00 51.51 C
ATOM 13 O TYR A 36 -4.561 24.580 2.250 1.00 51.93 O
ATOM 14 CB TYR A 36 -2.986 27.111 3.117 1.00 49.10 C
ATOM 15 CG TYR A 36 -2.305 28.456 3.074 1.00 45.23 C
As you can see, the relevant information (CA) is in the third column of the PDB file. Whenever I run my code, it does not write any new files, but it doesn't give me any errors. What could I be doing wrong here?
Upvotes: 3
Views: 6163
Reputation: 744
Below you can find a script that loads the protein structure 1p49.pdb (from the script's directory), parses it, and saves only the alpha carbon atom coordinates to the file 1p49_out.pdb:
#!/usr/bin/env python3
import Bio
print("Biopython v" + Bio.__version__)
from Bio.PDB import PDBParser
from Bio.PDB import PDBIO
# Parse and get basic information
parser=PDBParser()
protein_1p49 = parser.get_structure('STS', '1p49.pdb')
protein_1p49_resolution = protein_1p49.header["resolution"]
protein_1p49_keywords = protein_1p49.header["keywords"]
print("Sample name: " + str(protein_1p49))
print("Resolution: " + str(protein_1p49_resolution))
print("Keywords: " + str(protein_1p49_keywords))
print("Model: " + str(protein_1p49[0]))
#initialize IO
io=PDBIO()
#custom select
class Select():
    def accept_model(self, model):
        return True
    def accept_chain(self, chain):
        return True
    def accept_residue(self, residue):
        return True
    def accept_atom(self, atom):
        print("atom id:" + atom.get_id())
        print("atom name:" + atom.get_name())
        if atom.get_name() == 'CA':
            print("True")
            return True
        else:
            return False
#write to output file
io.set_structure(protein_1p49)
io.save("1p49_out.pdb", Select())
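If the goal is only to copy the CA records verbatim, as the question describes, a plain-text filter may be enough. The sketch below is an alternative to the Select approach above and does not use Biopython at all. It assumes the input follows the fixed-width PDB layout, where the atom name occupies columns 13-16 of an ATOM record. The function names is_alpha_carbon and filter_alpha_carbons are hypothetical, chosen for illustration.

```python
def is_alpha_carbon(line):
    """Return True for an ATOM record whose atom-name field is CA.

    Assumes fixed-width PDB columns: the record type is in columns 1-6
    and the atom name in columns 13-16 (string indices 12 to 15).
    HETATM records are skipped on purpose, so a calcium ion, which is
    also named CA, is not mistaken for an alpha carbon.
    """
    return line.startswith("ATOM  ") and line[12:16].strip() == "CA"


def filter_alpha_carbons(in_path, out_path):
    """Copy the alpha-carbon ATOM records from in_path to out_path.

    Returns the number of records written.
    """
    kept = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if is_alpha_carbon(line):
                dst.write(line)
                kept += 1
    return kept
```

With the file names from the question, this would be called as filter_alpha_carbons("A.pdb", "alpha.pdb"). Unlike PDBIO, it copies the matching lines unchanged and writes no TER or END records, so the Biopython route is probably the better choice when the output has to be a complete PDB file.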
Upvotes: 3