Reputation: 37
Currently:
def detect_repet(s):
    return_list = []
    split_text = s.split('\n')
    print(split_text)
    for x in split_text:
        print(x)
    return
print(detect_repet('Well, sheep says beeeee and\ncat says miaaaaaaaw\nand cow would shout mooooooow'))
I am struggling to detect at least two identical characters in a row in a string. I've tried this, but the index goes out of range in later iterations:
my_string = "danieeeeel"
for i in range(len(my_string)):
    if my_string[i] == my_string[i+1] == my_string[i+2]:
        print('YES')
    else:
        print('NO')
The ideal output of detect_repet would be ['beeeee', 'miaaaaaaaw', 'mooooooow']
Upvotes: 0
Views: 70
Reputation: 5802
There's nothing to add to @rdas's answer in terms of solving your specific problem. Still, on a didactic note, I'd like to point to groupby from the standard-library itertools module. Whenever you need to group consecutive equal elements of a sequence [a, a, a, b, b, b] into chunks [[a, a, a], [b, b, b]], the first thing that should come to mind is groupby.
It yields (label, subsequence) tuples you can iterate over. Since subsequence is an iterator, you have to turn it into a list in order to calculate its length. With that in mind, another approach to your problem could be something like:
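from itertools import groupby

def detect_repeat(s):
    # groupby yields one (key, group) pair per run of consecutive equal characters
    for group in groupby(s):
        # a run longer than 2 means at least three identical characters in a row
        if len(list(group[1])) > 2:
            return True
    return False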
This can be made even more concise and efficient, but it illustrates the idea.
You'd use it like this:
>>> text = 'Well, sheep says beeeee and\ncat says miaaaaaaaw\nand cow would shout mooooooow'
>>> [word for word in text.split() if detect_repeat(word)]
['beeeee', 'miaaaaaaaw', 'mooooooow']
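As a rough illustration of the "more concise" remark above, the same idea fits into a single any() expression. This is only a sketch: the name has_run and the min_len parameter are made up for this example and do not appear in the answer.

```python
from itertools import groupby

def has_run(word, min_len=3):
    """Return True if some character occurs at least min_len times in a row."""
    # sum(1 for _ in group) counts the run without building a list;
    # any() stops at the first run that is long enough.
    return any(sum(1 for _ in group) >= min_len for _, group in groupby(word))

text = 'Well, sheep says beeeee and\ncat says miaaaaaaaw\nand cow would shout mooooooow'
result = [word for word in text.split() if has_run(word)]
print(result)
```

Passing min_len=2 would also accept doubled letters such as the "ee" in sheep.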
Upvotes: 0
Reputation: 1826
You can check whether the slice line[i:i+3] consists of a single repeated character by converting it to a set. The loop runs len(line) - 2 times so that every slice line[i:i+3] contains exactly three characters:
def detect_repeat(string):
    retval = set()
    for line in string.split():
        for i in range(len(line) - 2):
            if len(set(line[i:i+3])) == 1:
                retval.add(line)
    return retval
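or, with a set comprehension (the for clauses must appear in the same order as the nested loops above, otherwise line is used before it is defined):
detect_repeat = lambda s: {line for line in s.split() for i in range(len(line) - 2) if len(set(line[i:i+3])) == 1}
Output either way: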
{'miaaaaaaaw', 'beeeee', 'mooooooow'}
Upvotes: 0
Reputation: 21285
Your loop should only iterate over range(len(my_string) - 2), so that the index i+2 always stays less than len(my_string).
You should also use a set to avoid duplicate results from longer runs of the same character:
def detect_repet(string):
    retval = set()
    for line in string.split():
        for i in range(len(line) - 2):
            if line[i] == line[i + 1] == line[i + 2]:
                retval.add(line)
    return retval
print(detect_repet('Well, sheep says beeeee and\ncat says miaaaaaaaw\nand cow would shout mooooooow'))
Result:
{'beeeee', 'miaaaaaaaw', 'mooooooow'}
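A set does not guarantee any particular order, while the question asks for a list. If the order of the words matters, one possible variation is to collect them as dictionary keys, which keep insertion order in Python 3.7+. This is only a sketch; the name detect_repeat_ordered is invented for this example.

```python
def detect_repeat_ordered(text):
    """Return the words with three identical characters in a row, in input order."""
    found = {}  # dict keys act as an insertion-ordered set (Python 3.7+)
    for word in text.split():
        # range(len(word) - 2) keeps the index i + 2 inside the word
        if any(word[i] == word[i + 1] == word[i + 2] for i in range(len(word) - 2)):
            found[word] = None
    return list(found)

print(detect_repeat_ordered('Well, sheep says beeeee and\ncat says miaaaaaaaw\nand cow would shout mooooooow'))
```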
Upvotes: 1
Reputation: 521457
I would use re.findall with the regex pattern \b(\w*(\w)\2\w*)\b:
import re

inp = "Well, sheep says beeeee and\ncat says miaaaaaaaw\nand cow would shout mooooooow"
# findall returns (whole word, repeated character) tuples; keep only the word
matches = [x[0] for x in re.findall(r'\b(\w*(\w)\2\w*)\b', inp)]
print(matches) # ['Well', 'sheep', 'beeeee', 'miaaaaaaaw', 'mooooooow']
Note that your sample input string actually contains two other words that repeat the same letter two or more times in a row: Well and sheep.
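If only runs of three or more identical characters should count, which is what the expected output in the question suggests, the backreference can be given a quantifier. This is a sketch under that assumption; the names pattern and three_plus are chosen for this example.

```python
import re

inp = "Well, sheep says beeeee and\ncat says miaaaaaaaw\nand cow would shout mooooooow"

# (\w) captures one character and \1{2,} demands at least two more copies
# right after it, so the word must contain a run of three or more.
pattern = r'\b\w*(\w)\1{2,}\w*\b'

# finditer gives match objects; group(0) is the whole matched word
three_plus = [m.group(0) for m in re.finditer(pattern, inp)]
print(three_plus)
```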
Upvotes: 2