Reputation: 25
My function was supposed to receive a large string, go through it, and find the maximum number of times the pattern "AGATC" repeats consecutively. Regardless of what I feed this function, my return is always 1.
def agatc(s):
    maxrep = 0
    temp = 0
    for i in range(len(s) - 4):
        if s[i] == "A" and s[i + 1] == "G" and s[i + 2] == "A" and s[i + 3] == "T" and s[i + 4] == "C":
            temp += 1
            print(i)
            i += 3
        else:
            if temp > maxrep:
                maxrep = temp
            temp = 0
    return maxrep
I also tried initializing the for loop with (0, len(s) - 4, 1) and got the same return.
I thought the problem might be in adding 3 to the i variable (apparently it wasn't), so I added print(i) to see what was happening. I got the following:
45
1938
2049
2195
2952
2957
2962
2967
2972
2977
2982
2987
2992
2997
3002
3007
3012
3017
3022
3689
4754
Upvotes: 1
Views: 91
Reputation: 15364
In this way you can find the number of overlapping matches:
def agatc(s):
    temp = 0
    for i in range(len(s) - len("AGATC") + 1):
        if s[i:i+len("AGATC")] == "AGATC":
            temp += 1
    return temp
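To make "overlapping" concrete, here is a minimal sketch. The helper name count_overlapping and its pattern parameter are assumptions for illustration, not part of the code above. With a pattern such as "AA", matches can share characters; "AGATC" cannot overlap with itself, so for that pattern the overlapping and non-overlapping counts come out the same.

```python
def count_overlapping(s, pattern):
    """Count every position where pattern occurs in s, overlaps included."""
    count = 0
    for i in range(len(s) - len(pattern) + 1):
        if s[i:i + len(pattern)] == pattern:
            count += 1
    return count


print(count_overlapping("AAAA", "AA"))                  # 3: matches start at 0, 1 and 2
print(count_overlapping("AGATCAGATCxAGATC", "AGATC"))   # 3: no two matches share a character
```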
If you want to find non-overlapping matches:
def agatc(s):
    temp = 0
    i = 0
    while i < len(s) - len("AGATC") + 1:
        if s[i:i+len("AGATC")] == "AGATC":
            temp += 1
            i += len("AGATC")
        else:
            i += 1
    return temp
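Note that the functions above count every match in the string. If what you need is the longest consecutive run, as in the question, the same while-loop idea can be extended. A while loop is used because reassigning i inside a for loop over range() has no effect on the next iteration, which is why the i += 3 in the question did nothing. This is only a sketch: the name longest_run and the pattern parameter are assumptions, and it relies on the pattern not overlapping with itself (true for "AGATC").

```python
def longest_run(s, pattern="AGATC"):
    """Return the largest number of back-to-back copies of pattern in s."""
    n = len(pattern)
    best = 0      # longest run seen so far
    current = 0   # length of the run currently being extended
    i = 0
    while i <= len(s) - n:
        if s[i:i + n] == pattern:
            current += 1
            best = max(best, current)
            i += n          # jump past the whole match
        else:
            current = 0     # the run is broken, start counting again
            i += 1
    return best


print(longest_run("oooooooooAGATCooooAGATCAGATCAGATCAGATCooooooAGATCAGATC"))  # 4
```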
Upvotes: 3
Reputation: 27557
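This function counts the greatest number of consecutive 'AGATC's in a string and returns that number:
import re

def agatc(s):
    w = "AGATC"
    maxrep = [m.start() for m in re.finditer(w, s)]  # The beginning index for each AGATC
    if not maxrep:  # no match at all, so the longest run is 0
        return 0
    c = ''
    for i, v in enumerate(maxrep):
        if i < len(maxrep) - 1:
            if v + 5 == maxrep[i + 1]:  # the next match starts right after this one
                c += 'y'
            else:
                c += 'n'
    # The longest streak of 'y' plus one is the longest run of matches
    return len(max(c.split('n'), key=len)) + 1

print(agatc("oooooooooAGATCooooAGATCAGATCAGATCAGATCooooooAGATCAGATC"))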
Output:
4
Upvotes: 0
Reputation: 331
A simple solution with the re module:
import re

s = 'FGHAGATCATCFJSFAGATCAGATCFHGH'
# Each match is one unbroken run of one or more 'AGATC'
match = re.finditer('(?P<name>AGATC)+', s)
max_len = 0
result = tuple()
for m in match:
    l = m.end() - m.start()  # length of this run in characters
    if l > max_len:
        max_len = l
        result = (m.start(), m.end())
print(result)  # (start, end) of the longest run
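The snippet above reports where the longest run starts and ends. To get the repeat count the question asks for, the run length can be divided by the pattern length. A minimal sketch, assuming a hypothetical wrapper named longest_run_regex:

```python
import re


def longest_run_regex(s, pattern="AGATC"):
    """Return the largest number of consecutive copies of pattern in s."""
    # (?:...)+ matches one or more back-to-back copies; re.escape keeps
    # the pattern literal if it ever contains regex metacharacters.
    runs = re.finditer("(?:" + re.escape(pattern) + ")+", s)
    # default=0 covers the case where the pattern never occurs
    return max((len(m.group(0)) // len(pattern) for m in runs), default=0)


print(longest_run_regex('FGHAGATCATCFJSFAGATCAGATCFHGH'))  # 2
```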
Upvotes: 1
Reputation: 3305
Personally I would use regular expressions. But if you do not want that, you could use the str.find() method. Here is my solution:
def agatc(s):
    cnt = 0
    findstr = 'aga'  # pattern you are looking for
    for i in range(len(s)):
        index = s.find(findstr)
        if index != -1:
            cnt += 1
            s = s[index+1:]  # overlapping matches
            # s = s[index+len(findstr):]  # non-overlapping matches only
            print(index, s)  # just to see what happens
    return cnt
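As a variation, str.find() accepts an optional start index, so the string does not have to be re-sliced on every match and the loop can stop as soon as nothing more is found. A minimal sketch, where the name count_matches and the overlapping flag are assumptions for illustration:

```python
def count_matches(s, pattern, overlapping=True):
    """Count occurrences of pattern in s using str.find with a start offset."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    count = 0
    start = 0
    # Advance by one character to allow overlaps, or by the full pattern to forbid them
    step = 1 if overlapping else len(pattern)
    while True:
        index = s.find(pattern, start)  # search from start onwards
        if index == -1:
            return count
        count += 1
        start = index + step


print(count_matches("agagaga", "aga"))                     # 3: matches at 0, 2 and 4
print(count_matches("agagaga", "aga", overlapping=False))  # 2: matches at 0 and 4
```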
Upvotes: 0