Reputation: 107
I'm having a problem writing this loop; it seems to stop after the second sequence.
I want to return the DNA sequence that is complementary to a given DNA sequence.
E.g. 'AGATTC' -> 'TCTAAG', where A pairs with T and C pairs with G.
def get_complementary_sequence(dna):
    """(str) -> str
    Return the DNA sequence that is complementary to the given DNA sequence
    >>> get_complementary_sequence('AT')
    ('TA')
    >>> get_complementary_sequence('AGATTC')
    ('TCTAAG')
    """
    x = 0
    complementary_sequence = ''
    for char in dna:
        complementary_sequence = (get_complement(dna))
    return complementary_sequence + (dna[x:x+1])
Can anyone spot why the loop does not continue?
Upvotes: 3
Views: 2740
Reputation: 2640
Here is an example of how I would do it; it only takes a few lines of code:
from string import maketrans

DNA = "CCAGCTTATCGGGGTACCTAAATACAGAGATAT"  # example DNA fragment

def complement(sequence):
    reverse = sequence[::-1]
    return reverse.translate(maketrans('ATCG', 'TAGC'))

print complement(DNA)
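The snippet above is Python 2 (`string.maketrans` and the `print` statement), and it reverses the sequence before translating, so it yields the reverse complement rather than the 'TCTAAG' the question asks for. A minimal Python 3 sketch of the same `translate` idea, assuming plain uppercase A/T/C/G input, might look like this. The names `COMPLEMENT_TABLE` and `reverse_complement` are introduced here for illustration and are not from the original answer.

```python
# Python 3 sketch: str.maketrans takes the place of Python 2's string.maketrans.
# COMPLEMENT_TABLE is a hypothetical name, not part of the original answer.
COMPLEMENT_TABLE = str.maketrans('ATCG', 'TAGC')

def complement(sequence):
    """Return the base-by-base complement, keeping the original order."""
    return sequence.translate(COMPLEMENT_TABLE)

def reverse_complement(sequence):
    """Return the complement read in the opposite direction."""
    return complement(sequence)[::-1]

print(complement('AGATTC'))          # TCTAAG
print(reverse_complement('AGATTC'))  # GAATCT
```

Which of the two you want depends on whether the result should be read in the same direction as the input; the question's examples keep the original order.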
Upvotes: 3
Reputation: 3106
You call get_complement on all of dna instead of on each char. This simply calls the same function with the same parameters len(dna) times. There's no reason to loop through the chars if you never use them. If get_complement() can take a char, I would recommend:
for char in dna:
    complementary_sequence += get_complement(char)
The implementation of get_complement would take a single character and return its complement.
Also, you're returning complementary_sequence + (dna[x:x+1]). If you want the function to conform to the behavior that you've documented, the + (dna[x:x+1]) will add an extra (wrong) character from the beginning of the dna string. All you need to return is complementary_sequence! Thanks to @Kevin for noticing.
What you're doing:
>>> dna = "1234"
>>> for char in dna:
...     print dna
...
1234
1234
1234
1234
What I think is closer to what you want to be doing:
>>> for char in dna:
...     print char
...
1
2
3
4
Putting it all together:
# you could also use a list comprehension, with a join() call, but
# this is closer to your original implementation.
def get_complementary_sequence(seq):
    complement = ''
    for char in seq:
        complement += get_complement(char)
    return complement

def get_complement(base):
    complements = {'A':'T', 'T':'A', 'C':'G', 'G':'C'}
    return complements[base]

>>> get_complementary_sequence('AT')
'TA'
>>> get_complementary_sequence('AGATTC')
'TCTAAG'
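The comment in the code above mentions a list comprehension with a join() call as an alternative. A rough sketch of that variant, written so it also runs on Python 3, could look like the following. Moving the mapping to a module-level `COMPLEMENTS` constant is my own choice for illustration, not something from the original code.

```python
# Hypothetical module-level table; the name COMPLEMENTS is not from the original post.
COMPLEMENTS = {'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'}

def get_complement(base):
    # Plain dict lookup: raises KeyError for anything other than A, T, C or G.
    return COMPLEMENTS[base]

def get_complementary_sequence(seq):
    # Collect the complement of every character, then join once at the end.
    return ''.join([get_complement(char) for char in seq])

print(get_complementary_sequence('AT'))      # TA
print(get_complementary_sequence('AGATTC'))  # TCTAAG
```

Because the lookup is strict, input such as lowercase letters or 'N' fails with a KeyError instead of silently producing a wrong sequence.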
Upvotes: 2
Reputation: 141790
You're calling:
complementary_sequence = (get_complement(dna))
...n times where n is the length of the string. This leaves you with whatever the return value of get_complement(dna) is in complementary_sequence. Presumably just one letter.
You then return this one letter (complementary_sequence) followed by the substring dna[0:1] (i.e. the first letter in dna), because x is always 0.
This would be why you always get two characters returned.
Assuming you have a function like:
def get_complement(d):
    return {'T': 'A', 'A': 'T', 'C': 'G', 'G': 'C'}.get(d, d)
...you could fix your function by simply using str.join() and a list comprehension:
def get_complementary_sequence(dna):
    """(str) -> str
    Return the DNA sequence that is complementary to the given DNA sequence
    >>> get_complementary_sequence('AT')
    ('TA')
    >>> get_complementary_sequence('AGATTC')
    ('TCTAAG')
    """
    return ''.join([get_complement(c) for c in dna])
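One detail worth seeing in action is the `.get(d, d)` fallback: any character that is not in the mapping is passed through unchanged rather than raising an error. A small self-contained sketch of the two functions above, with a couple of inputs I made up to show that behavior:

```python
def get_complement(d):
    # dict.get returns the second argument when the key is missing,
    # so unknown characters fall back to themselves.
    return {'T': 'A', 'A': 'T', 'C': 'G', 'G': 'C'}.get(d, d)

def get_complementary_sequence(dna):
    return ''.join([get_complement(c) for c in dna])

print(get_complementary_sequence('AGATTC'))  # TCTAAG
print(get_complementary_sequence('AGNTTC'))  # TCNAAG ('N' is left as-is)
print(get_complementary_sequence('agat'))    # agat (lowercase is not in the mapping)
```

Whether passing unknown characters through is desirable depends on the input; if invalid bases should be rejected, indexing the dictionary directly would raise a KeyError instead.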
Upvotes: 2