Reputation: 31
I'm new to programming, so please don't be cruel to me :) I'm struggling with the problem of finding the highest number of consecutive repetitions of a substring in a string. I'm given a substring, for example "ABC", and a file with sequences of letters, e.g. "ABC ABC BBC CDA ABC ABC ABC DBA" (the spaces are not in the actual data; I added them only for readability). Here the output should be 3, since that is the highest number of repetitions one after another.
I'm thinking of using the str.count(sub[, start[, end]]) method, but I have no idea how to use it to get valid output. I tried creating a substring s = string[i][j] and then comparing it with s2, which is string[i+len(substring):j+len(substring)], but there seemed to be too many cases, so I gave up on it. With the code below I got valid output, but only in a few cases. I hope you can help me with it. Thanks!
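string = "ABCABCBBCCDAABCABCDBA"
substring = "ABC"
substr_count = 0
start = 0  # set once, outside the loop, so the search can advance
while True:
    loc = string.find(substring, start)
    if loc == -1:
        break
    # counts every occurrence, consecutive or not
    substr_count += 1
    start = loc + len(substring)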
Upvotes: 2
Views: 535
Reputation: 10799
As usr2564301 said, itertools.groupby would be the way to go.
Here's a silly, kind of brute-force-ish way to go about it:
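def max_repetitions(string, substring):
    # An empty substring would match everywhere, so treat it as no repetitions.
    if not substring:
        return 0
    # Try the largest possible repeat count first and work downwards.
    for count in range(len(string) // len(substring), 0, -1):
        if substring * count in string:
            return count
    return 0

string = "ABCABCBBCCDAABCABCDBA"
substring = "ABC"
print(max_repetitions(string, substring))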
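For comparison, here is one way the itertools.groupby idea mentioned at the top of this answer could be sketched. This is only my guess at an approach, not necessarily what the commenter had in mind; the name longest_run and the hits list are made up for illustration. It marks every index where the substring starts, then looks for the longest streak of marks spaced exactly len(substring) apart.

```python
from itertools import groupby

def longest_run(string, substring):
    # Hypothetical helper: longest streak of back-to-back copies of substring.
    if not substring:
        return 0
    step = len(substring)
    # hits[i] is True when a copy of substring starts at index i.
    hits = [string.startswith(substring, i) for i in range(len(string))]
    best = 0
    # Copies that sit back to back start exactly `step` apart, so they are
    # neighbours inside one of the slices hits[offset::step].
    for offset in range(step):
        for is_hit, group in groupby(hits[offset::step]):
            if is_hit:
                best = max(best, sum(1 for _ in group))
    return best

print(longest_run("ABCABCBBCCDAABCABCDBA", "ABC"))
```

Unlike the brute-force version, this does not build ever-longer candidate strings, though it still checks every starting index once.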
Upvotes: 4
Reputation: 90
You can do this easily in a few lines of code using regular expressions.
import re
string = "ABCABCBBCCDAABCABCDBA"
string_regex = re.compile(r'(?:ABC)+')
in_a_row = string_regex.findall(string)  # every run of back-to-back "ABC"
substr_count = max(map(len, in_a_row), default=0) // len('ABC')
print(substr_count)
Import re like you would any other module, define the string, put whatever you want to find where the ABC is now, and go.
This works by searching the given string, here named 'string', for every run of one or more back-to-back copies (that's what the plus sign is for) of the text inside the parentheses. The ?: makes the group non-capturing, so findall returns each whole run rather than only the group. Then take the length of the longest run in in_a_row and divide it by the length of the substring you asked for, and you are left with how many times it repeats consecutively. Note that findall returns non-overlapping matches, so this is only reliable when the substring cannot overlap itself (as with "ABC"), and any regex metacharacters in the substring need escaping with re.escape.
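If you want to plug in an arbitrary substring rather than a hard-coded ABC, something along these lines might be safer. It is a rough sketch, not a polished solution: max_consecutive is a name I made up, re.escape guards against regex metacharacters in the substring, and the lookahead lets the search restart at every index so that substrings which overlap themselves (like "ABA") are not missed.

```python
import re

def max_consecutive(string, substring):
    # Hypothetical helper: highest number of back-to-back copies of substring.
    if not substring:
        return 0
    # The lookahead (?=...) consumes nothing, so a match is attempted at every
    # index; group 1 captures the greedy run of copies starting at that index.
    pattern = re.compile('(?=((?:' + re.escape(substring) + ')+))')
    runs = pattern.findall(string)
    # default=0 covers the case where the substring never occurs.
    return max(map(len, runs), default=0) // len(substring)

print(max_consecutive("ABCABCBBCCDAABCABCDBA", "ABC"))
print(max_consecutive("ABABAABAABA", "ABA"))
```

The second call is the overlapping case: the longest streak of "ABA" starts at index 2, inside the first occurrence, which a plain non-overlapping scan would skip past.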
Upvotes: 1