Reputation: 201
I am not sure if the title of this question is appropriate, so anyone is welcome to edit it. Thank you!
My question is: if I have a protein sequence stored as a string:
seq='MIGQFGL'
How can I convert it to something like this:
MET 1
ILE 2
GLY 3
GLN 4
PHE 5
GLY 6
LEU 7
This is what I have tried:
f= open("protein.seq", "w")
seq = 'MIGQFGL'
d = {'C': 'CYS', 'D': 'ASP', 'S': 'SER', 'Q': 'GLN', 'K': 'LYS',
'I': 'ILE', 'P': 'PRO', 'T': 'THR', 'F': 'PHE', 'N': 'ASN',
'G': 'GLY', 'H': 'HIS', 'L': 'LEU', 'R': 'ARG', 'W': 'TRP',
'A': 'ALA', 'V':'VAL', 'E': 'GLU', 'Y': 'TYR', 'M': 'MET'}
sp = list(seq)
rep = '\n'.join(d.get(e,e) for e in sp) #to replace the items in list 'sp' with corresponding dictionary values
no = list(range(1,8))
n = '\n'.join(str(x) for x in no)
line = "{}\t{}\n".format(rep,n)
f.write(line)
But this is what I got:
MET
ILE
GLY
GLN
PHE
GLY
LEU 1
2
3
4
5
6
7
So, I changed this line:
line = "{}\t{}\n".format(rep,n)
to:
line = "{}\t{}\n".format(zip(rep,n))
But I got:
Traceback (most recent call last):
File "protein.py", line 15, in <module>
line = "{}\t{}\n".format(zip(rep,n))
IndexError: tuple index out of range
What am I doing wrong? Thanks in advance!
NB: I use Python 3.
Upvotes: 1
Views: 75
Reputation: 3801
Using enumerate will get you what you want:
rep = '\n'.join('{0} {1}'.format(d.get(s,s), i+1) for i, s in enumerate(seq))
Also, it is best practice to use with when handling files, as it is both safer and neater, i.e.:
with open('proteins.seq', 'w') as f:
    f.write(rep)
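Putting the two pieces of this answer together, a minimal sketch might look like the following. The names THREE_LETTER, format_residues and write_residues are hypothetical, chosen only for illustration; the mapping is the same dictionary as in the question.

```python
# One-letter to three-letter amino acid codes (same mapping as in the question).
THREE_LETTER = {
    'C': 'CYS', 'D': 'ASP', 'S': 'SER', 'Q': 'GLN', 'K': 'LYS',
    'I': 'ILE', 'P': 'PRO', 'T': 'THR', 'F': 'PHE', 'N': 'ASN',
    'G': 'GLY', 'H': 'HIS', 'L': 'LEU', 'R': 'ARG', 'W': 'TRP',
    'A': 'ALA', 'V': 'VAL', 'E': 'GLU', 'Y': 'TYR', 'M': 'MET',
}


def format_residues(seq):
    """Return one 'NAME number' line per residue, numbered from 1."""
    return '\n'.join(
        # .get(s, s) leaves any letter missing from the mapping unchanged.
        '{0} {1}'.format(THREE_LETTER.get(s, s), i + 1)
        for i, s in enumerate(seq)
    )


def write_residues(seq, path):
    """Write the formatted residues to path (hypothetical helper)."""
    # The with block closes the file even if write() raises.
    with open(path, 'w') as f:
        f.write(format_residues(seq) + '\n')


print(format_residues('MIGQFGL'))
```

Calling write_residues('MIGQFGL', 'protein.seq') would then produce the file the question asks for, assuming the working directory is writable.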
Upvotes: 3
Reputation: 6748
Try this small fix of your code:
f= open("protein.seq", "w")
seq = 'MIGQFGL'
d = {'C': 'CYS', 'D': 'ASP', 'S': 'SER', 'Q': 'GLN', 'K': 'LYS',
'I': 'ILE', 'P': 'PRO', 'T': 'THR', 'F': 'PHE', 'N': 'ASN',
'G': 'GLY', 'H': 'HIS', 'L': 'LEU', 'R': 'ARG', 'W': 'TRP',
'A': 'ALA', 'V':'VAL', 'E': 'GLU', 'Y': 'TYR', 'M': 'MET'}
sp = list(seq)
rep = [d.get(e, e) for e in sp]  # replace each one-letter code in 'sp' with its dictionary value
no = list(range(1, len(seq) + 1))  # residue numbers, starting at 1
n = [str(x) for x in no]
line = '\n'.join([e[0] + " " + e[1] for e in zip(rep, n)])
f.write(line)
f.close()
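As background on why the zip attempt in the question failed: there, rep and n had already been joined into single strings, and format received one argument (the zip object) for two placeholders. The sketch below keeps both sides as lists and unpacks each pair; names, numbers and pairs are illustrative names, not taken from the question.

```python
names = ['MET', 'ILE', 'GLY']
numbers = [1, 2, 3]

# zip pairs the i-th name with the i-th number; each pair is a tuple.
pairs = list(zip(names, numbers))

# Passing the tuple itself supplies ONE argument to a template with TWO
# placeholders, so format() raises IndexError.
try:
    '{}\t{}'.format(pairs[0])
    raised = False
except IndexError:
    raised = True

# Unpacking with * supplies the two arguments the template expects.
text = '\n'.join('{}\t{}'.format(*pair) for pair in pairs)
print(text)
```

This only works if zip is given the lists before they are joined with '\n'; zipping two already-joined strings would pair individual characters instead.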
Upvotes: 2
Reputation: 430
I hope you're looking for something like this:
[{i+1,j} for i,j in enumerate(list(seq))]
Results:
[set([1, 'M']), set(['I', 2]), set([3, 'G']), set(['Q', 4]), set([5, 'F']), set([6, 'G']), set(['L', 7])]
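Note that a set has no defined element order, which is why the number and the letter swap positions in the results above. If a fixed (number, letter) order matters, a sketch using tuples instead could look like this (pairs and same_set are illustrative names):

```python
seq = 'MIGQFGL'

# A set ignores element order, so these two literals describe the same set.
same_set = {1, 'M'} == {'M', 1}

# A tuple keeps the number first and the letter second.
pairs = [(i, residue) for i, residue in enumerate(seq, 1)]
print(pairs)
```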
Upvotes: 1
Reputation: 44838
You're very close:
result = '\n'.join(f"{d.get(e,e)} {i}" for i, e in enumerate(seq, 1))
enumerate(seq, i) is an iterator that yields pairs of the form (i, seq[0]), (i + 1, seq[1]), ...
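As a small illustration of the start argument, the sketch below compares counting from the default 0 and adding 1 by hand with passing 1 as the start value. The variable names from_zero and from_one are made up for this example.

```python
seq = 'MIGQFGL'

# The default start is 0, so the index needs +1 to give residue numbers.
from_zero = [(i + 1, e) for i, e in enumerate(seq)]

# With a start value of 1 the counter already matches the residue number.
from_one = list(enumerate(seq, 1))

# Both approaches produce the same (number, letter) pairs.
print(from_zero == from_one)
```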
Upvotes: 2