Reputation: 9
Thanks in advance.
My problem is the following: I want to analyse a dataframe (list) consisting only of, e.g., "x" and "y". Only if "x" occurs at three consecutive indices do I want a statement that gives me the index of the third consecutive "x", not the fourth or the n-th. This should then repeat for the whole list, giving me the indices for all the times "x" occurred at three consecutive indices.
0 = y
1 = x
2 = y
3 = x
4 = x
5 = x
6 = x
7 = y
8 = x
9 = x
10 = x
and so on
Desired result:
print(i)
# 5, 10
Upvotes: 0
Views: 80
Reputation: 24281
A basic way to do it is to count the target values we see in a row, and to keep the indices when we have the exact number of values we expect:
def find_nth(data, target, n):
    out = []
    targets_in_a_row = 0  # length of the current run of target values
    for index, value in enumerate(data):
        if value != target:
            targets_in_a_row = 0
        else:
            targets_in_a_row += 1
            # Record the index only when the run reaches exactly n items
            if targets_in_a_row == n:
                out.append(index)
    return out

data = ['y', 'x', 'y', 'x', 'x', 'x', 'x', 'y', 'x', 'x', 'x']
print(find_nth(data, 'x', 3))
# [5, 10]
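The same counting idea can also be written in terms of runs. The following is only a sketch, not part of the answer above: it swaps the manual counter for itertools.groupby from the standard library, and the name find_nth_groupby is made up for illustration. It assumes, as in the question, that only the n-th item of each run should be reported, however long the run is.

```python
from itertools import groupby

def find_nth_groupby(data, target, n):
    """Return the index of the n-th item of every run of `target` with at least n items."""
    out = []
    position = 0  # index of the first element of the current run
    for value, run in groupby(data):
        length = sum(1 for _ in run)  # consume the run to get its length
        if value == target and length >= n:
            out.append(position + n - 1)
        position += length
    return out

data = ['y', 'x', 'y', 'x', 'x', 'x', 'x', 'y', 'x', 'x', 'x']
print(find_nth_groupby(data, 'x', 3))
# [5, 10]
```

Each run is visited once, so a run of four or more "x" values still contributes a single index.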
Another way (easily adaptable to find a more complicated pattern, but less efficient in this case) would be to use a collections.deque with a maximum length of n to keep the last n values we've seen. We can then easily check whether all of them are equal to the target.
We just need a flag (matched) that we set once we have n target values in a row and reset only when we get a different one.
from collections import deque

def find_nth(data, target, n):
    d = deque(maxlen=n)  # holds the last n values seen
    out = []
    matched = False  # True once the current run has already been reported
    for index, value in enumerate(data):
        d.append(value)
        if value != target:
            matched = False
        elif not matched and all(val == target for val in d):
            out.append(index)
            matched = True
    return out

data = ['y', 'x', 'y', 'x', 'x', 'x', 'x', 'y', 'x', 'x', 'x']
print(find_nth(data, 'x', 3))
# [5, 10]
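To illustrate the "more complicated pattern" remark, here is a rough sketch of how the deque window might be adapted to match an arbitrary sequence rather than n equal values. The function name find_pattern_ends and the example pattern are assumptions made for illustration; they are not from the answer above.

```python
from collections import deque

def find_pattern_ends(data, pattern):
    """Return the index at which each occurrence of `pattern` ends."""
    pattern = tuple(pattern)
    window = deque(maxlen=len(pattern))  # sliding window over the last values
    out = []
    for index, value in enumerate(data):
        window.append(value)
        # The window matches only once it is full and equal to the pattern
        if tuple(window) == pattern:
            out.append(index)
    return out

data = ['y', 'x', 'y', 'x', 'x', 'x', 'x', 'y', 'x', 'x', 'x']
print(find_pattern_ends(data, ['y', 'x', 'x', 'x']))
# [5, 10]
```

Matching "y" followed by three "x" removes the need for the matched flag, but it would miss a run of "x" at the very start of the list, since no "y" precedes it.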
Upvotes: 1
Reputation: 9
An easier way to implement the use case is:
def third_occ(data):
    """
    First find all the occurrences of "x" in data -> all_occ_x.
    Iterate through the occurrence indices and check whether they are consecutive
    by finding the difference between each pair of neighbouring indices.
    Append the index to third_occ when the second consecutive difference is 1.
    :return: list : third_occ
    """
    all_occ_x = [i for i, x in enumerate(data) if x == "x"]  # all occurrences of x in data
    count = 0  # consecutive differences of 1 seen so far
    third_occ = []
    for n1, n2 in zip(all_occ_x[:-1], all_occ_x[1:]):
        if n2 - n1 == 1 and n1 not in third_occ:
            count += 1
            if count == 2:
                third_occ.append(n2)
                count = 0
        else:
            count = 0
    return third_occ

# element index for reference
# ex = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15']
data = ['x', 'x', 'x', 'x', 'y', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'y', 'x', 'x', 'x']
print(third_occ(data))
# [2, 7, 10, 15]
# Note: in a run longer than three, every complete group of three is reported.
Upvotes: 0