Reputation: 175
Assuming I have n = 3
lists of same length for example:
R1 = [7,5,8,6,0,6,7]
R2 = [8,0,2,2,0,2,2]
R3 = [1,7,5,9,0,9,9]
I need to find the first index t
that verifies the n = 3 following conditions for a period p = 2
.
Edit: the meaning of period p
is the number of consecutive "boxes".
R1[t] >= 5, R1[t+1] >= 5
. Here t +p -1 = t+1
, we need to only verify for two boxes t
and t+1
. If p
was equal to 3
we will need to verify for t
, t+1
and t+2
. Note that It's always the same number for which we test, we always test if it's greater than 5
for every index. The condition is always the same for all the "boxes".R2[t] >= 2, R2[t+1] >= 2
R3[t] >= 9, R3[t+1] >= 9
In total there is 3 * p conditions.
Here the t
I am looking for is 5
(indexing is starting from 0).
The basic way to do this is by looping on all the indexes using a for
loop. If the condition is found for some index t
we store it in some local variable temp
and we verify the conditions still hold for every element whose index is between t+1
and t+p -1
. If while checking, we find an index that does not satisfy a condition, we forget about the temp
and we keep going.
What is the most efficient way to do this in Python if I have large lists (like of 10000 elements)? Is there a more efficient way than the for loop?
Upvotes: 0
Views: 228
Reputation: 3542
Since all your conditions are the same (>=
), we could leverage this.
This solution will work for any number of conditions and any size of analysis window, and no for loop is used.
You have an array:
>>> R = np.array([R1, R2, R3]).T
>>> R
array([[7, 8, 1],
[5, 0, 7],
[8, 2, 5],
[6, 2, 9],
[0, 0, 0],
[6, 2, 9],
[7, 2, 9]]
and you have thresholds:
>>> thresholds = [5, 2, 9]
So you can check where the conditions are met:
>>> R >= thresholds
array([[ True, True, False],
[ True, False, False],
[ True, True, False],
[ True, True, True],
[False, False, False],
[ True, True, True],
[ True, True, True]])
And where they all met at the same time:
>>> R_cond = np.all(R >= thresholds, axis=1)
>>> R_cond
array([False, False, False, True, False, True, True])
From there, you want the conditions to be met for a given window.
We'll use the fact that booleans can sum together, and convolution to apply the window:
>>> win_size = 2
>>> R_conv = np.convolve(R_cond, np.ones(win_size), mode="valid")
>>> R_conv
array([0., 0., 1., 1., 1., 2.])
The resulting array will have values equal to win_size
at the indices where all conditions are met on the window range.
So let's retrieve the first of those indices:
>>> index = np.where(R_conv == win_size)[0][0]
>>> index
5
If such an index doesn't exist, it will raise an IndexError
, I'm letting you handle that.
So, as a one-liner function, it gives:
def idx_conditions(arr, thresholds, win_size, condition):
return np.where(
np.convolve(
np.all(condition(arr, thresholds), axis=1),
np.ones(win_size),
mode="valid"
)
== win_size
)[0][0]
I added the condition as an argument to the function, to be more general.
>>> from operator import ge
>>> idx_conditions(R, thresholds, win_size, ge)
5
Upvotes: 2
Reputation: 308
Here is a solution using numpy ,without for loops:
import numpy as np
R1 = np.array([7,5,8,6,0,6,7])
R2 = np.array([8,0,2,2,0,2,2])
R3 = np.array([1,7,5,9,0,9,9])
a = np.logical_and(np.logical_and(R1>=5,R2>=2),R3>=9)
np.where(np.logical_and(a[:-1],a[1:]))[0].item()
ouput
5
Edit:
Generalization
Say you have a list of lists R
and a list of conditions c
:
R = [[7,5,8,6,0,6,7],
[8,0,2,2,0,2,2],
[1,7,5,9,0,9,9]]
c = [5,2,9]
First we convert them to numpy arrays. the reshape(-1,1) converts c
to a column matrix so that we can use pythons broadcasting feature in the >=
operator
R = np.array(R)
c = np.array(c).reshape(-1,1)
R>=c
output:
array([[ True, True, True, True, False, True, True],
[ True, False, True, True, False, True, True],
[False, False, False, True, False, True, True]])
then we perform logical & operation between all rows using reduce
function
a = np.logical_and.reduce(R>=c)
a
output:
array([False, False, False, True, False, True, True])
next we create two arrays by removing first and last element of a
and perform a logical & between them which shows which two subsequent elements satisfied the conditions in all lists:
np.logical_and(a[:-1],a[1:])
output:
array([False, False, False, False, False, True])
now np.where just shows the index of the True
element
np.where(np.logical_and(a[:-1],a[1:]))[0].item()
output:
5
Upvotes: 1
Reputation: 6483
This could be a way:
R1 = [7,5,8,6,0,6,7]
R2 = [8,0,2,2,0,2,2]
R3 = [1,7,5,9,0,9,9]
for i,inext in zip(range(len(R1)),range(len(R1))[1:]):
if (R1[i]>=5 and R1[inext]>=5)&(R2[i]>=2 and R2[inext]>=2)&(R3[i]>=9 and R3[inext]>=9):
print(i)
Output:
5
Edit: Generalization could be:
def foo(ls,conditions):
index=0
for i,inext in zip(range(len(R1)),range(len(R1))[1:]):
if all((ls[j][i]>=conditions[j] and ls[j][inext]>=conditions[j]) for j in range(len(ls))):
index=i
return index
R1 = [7,5,8,6,0,6,7]
R2 = [8,0,2,2,0,2,2]
R3 = [1,7,5,9,0,9,9]
R4 = [1,7,5,9,0,1,1]
R5 = [1,7,5,9,0,3,3]
conditions=[5,2,9,1,3]
ls=[R1,R2,R3,R4,R5]
print(foo(ls,conditions))
Output:
5
And, maybe if the arrays match the conditions multiple times, you could return a list of the indexes:
def foo(ls,conditions):
index=[]
for i,inext in zip(range(len(R1)),range(len(R1))[1:]):
if all((ls[j][i]>=conditions[j] and ls[j][inext]>=conditions[j]) for j in range(len(ls))):
print(i)
index.append(i)
return index
R1 = [6,7,8,6,0,6,7]
R2 = [2,2,2,2,0,2,2]
R3 = [9,9,5,9,0,9,9]
R4 = [1,1,5,9,0,1,1]
R5 = [3,3,5,9,0,3,3]
conditions=[5,2,9,1,3]
ls=[R1,R2,R3,R4,R5]
print(foo(ls,conditions))
Output:
[0,5]
Upvotes: 1