Reputation: 51
I have written this function, but it does not behave as I would like. Any ideas, please? I think the problem is somehow in the condition on char...
def count_nucleotides(dna, nucleotide):
    ''' (str, str) -> int
    Return the number of occurrences of nucleotide in the DNA sequence dna.
    >>> count_nucleotides('ATCGGC', 'G')
    2
    >>> count_nucleotides('ATCTA', 'G')
    0
    '''
    num_nucleodites=0
    for char in dna:
        if char is ('A'or'T'or'C'or'G'):
            num_nucleodites=num_nucleodites + 1
    return num_nucleodites
Upvotes: 0
Views: 11033
Reputation: 2640
string = "ATAGTTCATGTACCGTTGCAGGGGG"
# One (nucleotide, count) pair per base; the parentheses make this work in Python 2 and 3.
print([(i, string.count(i)) for i in "ACTG"])
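As a variation on the same idea, here is a minimal sketch that tallies every character in one pass with collections.Counter from the standard library. The variable name counts is only illustrative, and this is one possible approach rather than the only way to do it.

```python
from collections import Counter

string = "ATAGTTCATGTACCGTTGCAGGGGG"

# Counter walks the string once and tallies each distinct character.
counts = Counter(string)

# Print the tallies in a fixed base order; a missing base would show as 0.
for base in "ACTG":
    print(base, counts[base])
```

Unlike calling string.count once per base, this scans the sequence a single time, which may matter for long sequences.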
Upvotes: 0
Reputation: 213223
The or condition that you have written does not work the way you think. You have to compare char with each of them, and then use or to join the comparisons (with == rather than is, since is tests identity, not equality), something like this:
if char == 'A' or char == 'T' ... and so on:
But a better way is to use the in operator.
Try this:
for char in dna:
    if char in 'ATCG':
        num_nucleodites=num_nucleodites + 1
But since you said in your question that you want the count of a specific nucleotide only, you don't need to check each char against all four of them. Here's how your code should look using a basic for loop:
def count_nucleotides(dna, nucleotide):
    num_nucleotide = 0
    for char in dna:
        # Count only the characters that match the requested nucleotide.
        if char == nucleotide:
            num_nucleotide = num_nucleotide + 1
    return num_nucleotide
Or just:
def count_nucleotides(dna, nucleotide):
    return dna.count(nucleotide)
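To check that the loop and the built-in agree, here is a small sketch that runs both against the two docstring examples from the question. The names count_nucleotides_loop and count_nucleotides_builtin are made up for this comparison so that both versions can live side by side.

```python
def count_nucleotides_loop(dna, nucleotide):
    """Count matches one character at a time."""
    num_nucleotide = 0
    for char in dna:
        if char == nucleotide:
            num_nucleotide += 1
    return num_nucleotide


def count_nucleotides_builtin(dna, nucleotide):
    """Let str.count do the work."""
    return dna.count(nucleotide)


# The two examples from the question's docstring: (dna, nucleotide, expected).
examples = [('ATCGGC', 'G', 2), ('ATCTA', 'G', 0)]

for dna, nucleotide, expected in examples:
    assert count_nucleotides_loop(dna, nucleotide) == expected
    assert count_nucleotides_builtin(dna, nucleotide) == expected

print("both versions agree with the docstring examples")
```

Note that the two only agree for a single-character nucleotide: the loop compares one character at a time, while str.count also accepts longer substrings.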
Upvotes: 0
Reputation: 336138
It seems you're looking for overlapping sequences of more than one nucleotide, according to one of your comments. This can be done with a regular expression:
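import re
def find_overlapping(needle, haystack):
    # (?=...) is a zero-width lookahead, so every start position is tested
    # and matches are allowed to overlap.
    return len(re.findall("(?=" + needle + ")", haystack))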
You can then use it like this:
>>> find_overlapping("AGA", "AGAGAGAAGAGAG")
5
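To see why the lookahead matters, here is a hedged sketch comparing it with str.count, which only counts non-overlapping occurrences. The re.escape call and the loop-based helper find_overlapping_loop are additions for illustration, not part of the answer above; re.escape just guards against a needle that happens to contain regex metacharacters.

```python
import re


def find_overlapping(needle, haystack):
    # Zero-width lookahead: nothing is consumed, so matches may overlap.
    # re.escape is an added safeguard (an assumption, not in the original).
    return len(re.findall("(?=" + re.escape(needle) + ")", haystack))


def find_overlapping_loop(needle, haystack):
    # Same idea without regular expressions: after each hit, resume the
    # search one character later instead of after the whole match.
    count = 0
    start = haystack.find(needle)
    while start != -1:
        count += 1
        start = haystack.find(needle, start + 1)
    return count


haystack = "AGAGAGAAGAGAG"
print(haystack.count("AGA"))                   # non-overlapping occurrences
print(find_overlapping("AGA", haystack))       # overlapping, via regex
print(find_overlapping_loop("AGA", haystack))  # overlapping, via str.find
```

On this input str.count reports fewer occurrences than the two overlapping counters, because it skips past each match before searching again.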
Upvotes: 2
Reputation: 169563
if char is ('A'or'T'or'C'or'G'):
That evaluates 'A' or 'T' or 'C' or 'G' first, which returns 'A' (the first truthy operand), and then checks whether char is that one value. Try it in the Python REPL:
>>> ('A'or'T'or'C'or'G')
'A'
I think what you meant to do was:
if char in ('A', 'T', 'C', 'G'):
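A short sketch of the effect on the question's example sequence: the chained or collapses to 'A', so only the A characters get counted, while the membership test counts all four bases. The variable names chained, buggy and fixed are illustrative, and == is used here in place of is so that the comparison does not depend on string interning.

```python
# The parenthesised expression is evaluated once, before any comparison.
chained = ('A' or 'T' or 'C' or 'G')
print(repr(chained))

dna = 'ATCGGC'

# What the original condition effectively tests: is char equal to 'A'?
buggy = sum(1 for char in dna if char == chained)

# What was probably intended: is char any one of the four bases?
fixed = sum(1 for char in dna if char in ('A', 'T', 'C', 'G'))

print(buggy, fixed)
```

Neither version uses the nucleotide argument, so for the function in the question the comparison still has to be made against nucleotide itself.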
Upvotes: 0
Reputation: 179412
What about just
def count_nucleotides(dna, nucleotide):
    return dna.count(nucleotide)
(Mind you, that's probably not going to fly as far as homework is concerned...)
Upvotes: 3