user1719345

Reputation: 51

Counting nucleotides in DNA sequence string

I have written this function, but it does not behave the way I would like. Any ideas, please? I understand the problem is somehow in the condition on char...

def count_nucleotides(dna, nucleotide):
    ''' (str, str) -> int

    Return the number of occurrences of nucleotide in the DNA sequence dna.

    >>> count_nucleotides('ATCGGC', 'G')
    2
    >>> count_nucleotides('ATCTA', 'G')
    0
    '''

    num_nucleodites=0

    for char in dna:
        if char is ('A'or'T'or'C'or'G'):
            num_nucleodites=num_nucleodites + 1      
    return num_nucleodites

Upvotes: 0

Views: 11033

Answers (5)

Stefan Gruenwald

Reputation: 2640

string = "ATAGTTCATGTACCGTTGCAGGGGG"
# Pair each base with the number of times it occurs in the sequence.
print([(i, string.count(i)) for i in "ACTG"])
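
If counts for all four bases are needed at once, collections.Counter from the standard library should give the same numbers in a single pass over the string, instead of one str.count scan per base. A minimal sketch; the name count_all_bases is made up for illustration and is not part of the answer above:

```python
from collections import Counter

def count_all_bases(dna):
    # Hypothetical helper: tally every character once, then report the four bases.
    counts = Counter(dna)
    # A Counter returns 0 for missing keys, so absent bases are still listed.
    return [(base, counts[base]) for base in "ACTG"]

print(count_all_bases("ATAGTTCATGTACCGTTGCAGGGGG"))
```

For a short sequence the difference is negligible; it only starts to matter on long strings.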

Upvotes: 0

Rohit Jain

Reputation: 213223

The or condition that you have written does not work the way you think. You have to compare char with each value separately and join those comparisons with or (and compare with ==, since is tests object identity rather than equality), something like this:

if char == 'A' or char == 'T' or char == 'C' or char == 'G':

But a better way is to use the in operator.

Try this:

for char in dna:
    if char in 'ATCG':
        num_nucleodites = num_nucleodites + 1

But since you said in your question that you want the count of one specific nucleotide only, you don't need to check each char against all four of them. Here's how your code should look using a basic for loop:

def count_nucleotides(dna, nucleotide):
    num_nucleotide = 0
    for char in dna:
        if char == nucleotide:
            num_nucleotide = num_nucleotide + 1
    return num_nucleotide

Or just:

def count_nucleotides(dna, nucleotide):
    return dna.count(nucleotide)
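
As a quick sanity check, the loop version and the str.count version can be run side by side on the two examples from the question's docstring. This is only a sketch; the names count_loop and count_builtin are chosen here so both variants can coexist in one file:

```python
def count_loop(dna, nucleotide):
    # Explicit loop: compare each character with the requested nucleotide.
    num = 0
    for char in dna:
        if char == nucleotide:
            num += 1
    return num

def count_builtin(dna, nucleotide):
    # str.count does the same work for a single character.
    return dna.count(nucleotide)

# The two examples from the question's docstring.
for dna, nucleotide in [("ATCGGC", "G"), ("ATCTA", "G")]:
    assert count_loop(dna, nucleotide) == count_builtin(dna, nucleotide)
    print(dna, nucleotide, count_builtin(dna, nucleotide))
```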

Upvotes: 0

Tim Pietzcker

Reputation: 336138

It seems you're looking for overlapping sequences of more than one nucleotide, according to one of your comments. This can be done with a regular expression:

import re
def find_overlapping(needle, haystack):
    return len(re.findall("(?=" + needle + ")", haystack))

You can then use it like this:

>>> find_overlapping("AGA", "AGAGAGAAGAGAG")
5
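
To see why the lookahead matters, it may help to compare it with str.count, which only counts non-overlapping occurrences. The sketch below repeats the function from this answer with one addition that is not in the original: re.escape, so that needle is treated literally if it ever contains regex metacharacters.

```python
import re

def find_overlapping(needle, haystack):
    # Zero-width lookahead: matches at every start position without consuming text.
    # re.escape is an addition to the original answer (see the note above).
    return len(re.findall("(?=" + re.escape(needle) + ")", haystack))

haystack = "AGAGAGAAGAGAG"
print(haystack.count("AGA"))              # non-overlapping occurrences
print(find_overlapping("AGA", haystack))  # overlapping occurrences
```

On this input str.count reports 3 while the lookahead version reports 5, because str.count resumes searching after the end of each match.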

Upvotes: 2

dbr

Reputation: 169563

if char is ('A'or'T'or'C'or'G'):

That evaluates 'A' or 'T' or 'C' or 'G' first, which returns 'A' (the first truthy operand), and then checks whether char is that 'A'. Try it in the Python REPL:

>>> ('A'or'T'or'C'or'G')
'A'

I think what you meant to do was:

if char in ('A', 'T', 'C', 'G'):
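
Put together, the difference looks roughly like this. The helper count_valid and the test string with the stray characters X and N are made up for illustration:

```python
# A non-empty string is truthy, so the `or` chain stops at its first operand.
collapsed = 'A' or 'T' or 'C' or 'G'
print(collapsed)

def count_valid(dna):
    # Hypothetical helper: counts characters that are one of the four bases.
    return sum(1 for char in dna if char in ('A', 'T', 'C', 'G'))

print(count_valid("ATXCGN"))
```

With the original condition only the 'A' characters would be counted; with the membership test every valid base is.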

Upvotes: 0

nneonneo

Reputation: 179412

What about just

def count_nucleotides(dna, nucleotide):
    return dna.count(nucleotide)

(Mind you, that's probably not going to fly as far as homework is concerned...)

Upvotes: 3
