user2773611

Reputation: 107

Complementary DNA sequence

I'm having a problem writing this loop; it seems to stop after the second sequence.

I want to return the complementary DNA sequence to the given DNA sequence.

E.g. ('AGATTC') -> ('TCTAAG'), where A pairs with T and C pairs with G.

def get_complementary_sequence(dna):
    """(str) -> str

    Return the DNA sequence that is complementary to the given DNA sequence

    >>> get_complementary_sequence('AT')
    ('TA')
    >>> get_complementary_sequence('AGATTC')
    ('TCTAAG')

    """

    x = 0
    complementary_sequence = ''

    for char in dna:
            complementary_sequence = (get_complement(dna))

    return complementary_sequence + (dna[x:x+1])

Can anyone spot why the loop does not continue?

Upvotes: 3

Views: 2740

Answers (3)

Stefan Gruenwald

Reputation: 2640

Here is an example of how I would do it; it takes only a few lines of code:

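DNA = "CCAGCTTATCGGGGTACCTAAATACAGAGATAT"  # example DNA fragment

def complement(sequence):
    # Reversing first yields the reverse complement; drop [::-1] to get
    # the plain complement shown in the question.
    reverse = sequence[::-1]
    # str.maketrans (Python 3) replaces string.maketrans from Python 2.
    return reverse.translate(str.maketrans('ATCG', 'TAGC'))

print(complement(DNA))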
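
The question's example ('AGATTC' -> 'TCTAAG') is the plain complement, with no reversal. A minimal sketch of the same translation-table idea for that case, assuming Python 3, where str.maketrans is a static method on str. The names COMPLEMENT_TABLE and plain_complement are illustrative and not part of the original answer:

```python
# Assumption: Python 3, where str.maketrans replaces string.maketrans.
# COMPLEMENT_TABLE and plain_complement are illustrative names.
COMPLEMENT_TABLE = str.maketrans('ATCG', 'TAGC')

def plain_complement(sequence):
    """Return the complement of sequence without reversing it."""
    return sequence.translate(COMPLEMENT_TABLE)

print(plain_complement('AGATTC'))  # TCTAAG
```

Building the table once at module level avoids rebuilding it on every call, which may matter if the function is applied to many sequences.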

Upvotes: 3

vroomfondel

Reputation: 3106

You call get_complement on all of dna instead of on each char. This simply calls the same function with the same argument len(dna) times. There's no reason to loop through the chars if you never use them. If get_complement() can take a single char, I would recommend:

for char in dna:
    complementary_sequence += get_complement(char)

The implementation of get_complement would take a single character and return its complement.

Also, you're returning complementary_sequence + (dna[x:x+1]). If you want the function to conform to the behavior that you've documented, note that the + (dna[x:x+1]) adds an extra (wrong) character from the beginning of the dna string. All you need to return is complementary_sequence! Thanks to @Kevin for noticing.

What you're doing:

>>> dna = "1234"
>>> for char in dna:
...     print(dna)
... 
1234
1234
1234
1234

What I think is closer to what you want to be doing:

>>> for char in dna:
...     print(char)
... 
1
2
3
4

Putting it all together:

# you could also use a list comprehension, with a join() call, but
# this is closer to your original implementation.
def get_complementary_sequence(seq):
    complement = ''
    for char in seq:
        complement += get_complement(char)
    return complement

def get_complement(base):
    complements = {'A':'T', 'T':'A', 'C':'G', 'G':'C'}
    return complements[base]

>>> get_complementary_sequence('AT')
'TA'
>>> get_complementary_sequence('AGATTC')
'TCTAAG'

Upvotes: 2

johnsyweb

Reputation: 141790

What's wrong?

You're calling:

complementary_sequence = (get_complement(dna))

...n times where n is the length of the string. This leaves you with whatever the return value of get_complement(dna) is in complementary_sequence. Presumably just one letter.

You then return this one letter (complementary_sequence) followed by the substring dna[0:1] (i.e. the first letter in dna), because x is always 0.

This would be why you always get two characters returned.
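
To make that concrete, here is a small sketch that reproduces the two-character result. The question does not show get_complement, so the version below is purely an assumption: a hypothetical helper that looks only at the first character of its argument. The name buggy_complementary_sequence is also illustrative.

```python
def get_complement(nucleotide):
    # Hypothetical helper (not shown in the question): it inspects only
    # the first character of whatever string it receives.
    return {'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'}[nucleotide[0]]

def buggy_complementary_sequence(dna):
    # Same structure as the function in the question.
    x = 0
    complementary_sequence = ''
    for char in dna:
        # Overwritten on every pass with the same value; char is unused.
        complementary_sequence = get_complement(dna)
    # x never changes, so this appends the first letter of dna.
    return complementary_sequence + dna[x:x+1]

print(buggy_complementary_sequence('AT'))      # TA
print(buggy_complementary_sequence('AGATTC'))  # TA
```

Under that assumption, the first docstring example ('AT' -> 'TA') passes by coincidence, which may be why the function looks correct for the first sequence and wrong for the second.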

How to fix it?

Assuming you have a function like:

def get_complement(d):
    return {'T': 'A', 'A': 'T', 'C': 'G', 'G': 'C'}.get(d, d)

...you could fix your function by simply using str.join() and a list comprehension:

def get_complementary_sequence(dna):
    """(str) -> str

    Return the DNA sequence that is complementary to the given DNA sequence

    >>> get_complementary_sequence('AT')
    'TA'
    >>> get_complementary_sequence('AGATTC')
    'TCTAAG'

    """

    return ''.join([get_complement(c) for c in dna])
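
One design choice worth noting in get_complement above is .get(d, d), which returns any unrecognised character unchanged rather than raising. A short sketch contrasting that with plain dictionary indexing follows; the names COMPLEMENTS, get_complement_lenient and get_complement_strict are illustrative, and 'N' stands in for any symbol outside A, T, C and G:

```python
COMPLEMENTS = {'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'}

def get_complement_lenient(d):
    # Unknown symbols are returned unchanged.
    return COMPLEMENTS.get(d, d)

def get_complement_strict(d):
    # Unknown symbols raise KeyError.
    return COMPLEMENTS[d]

print(''.join(get_complement_lenient(c) for c in 'AGNTC'))  # TCNAG

try:
    ''.join(get_complement_strict(c) for c in 'AGNTC')
except KeyError as err:
    print('unexpected symbol:', err)
```

Which behaviour is preferable depends on the input: passing unknown symbols through tolerates messy data, while raising surfaces malformed sequences early.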

Upvotes: 2
