Batselot

Reputation: 303

Thresholding a python list with multiple values

Okay, so I have a 1000x100 array of random numbers. I want to threshold this array against a list of several values; these values run from 3 to 9. Where the entries are higher than the threshold, I want the sum of the row appended to a list.

I have tried many ways, including a triple-nested for loop with a conditional. Right now, I have found a way to compare an array to a list of numbers, but every time the comparison runs the array is filled with new random numbers again.

import numpy as np

xpatient=5
sd_healthy=2
xhealthy=7
sd_patient=2
thresholdvalue1=(xpatient-sd_healthy)*10
thresholdvalue2=(((xhealthy+sd_patient))*10)
thresholdlist=[]
x1=[]
Ahealthy=np.random.randint(10,size=(1000,100))
Apatient=np.random.randint(10,size=(1000,100))
TParray=np.random.randint(10,size=(1,61))
def thresholding(A,B): 
    for i in range(A,B):
        thresholdlist.append(i)
        i+=1
thresholding(thresholdvalue1,thresholdvalue2+1)
thresholdarray=np.asarray(thresholdlist)
thedivisor=10
newthreshold=(thresholdarray/thedivisor)
for x in range(61):
    Apatient=np.random.randint(10,size=(1000,100))
    Apatient=[Apatient>=newthreshold[x]]*Apatient
    x1.append([sum(x) for x in zip(*Apatient)])

So, my for loop regenerates the random array on every iteration, but if I don't do that, I don't get to see the effect of the threshold on each turn. I want the thresholds applied to the whole array to be 3, 3.1, 3.2 and so on. I hope I got my point across. Thanks in advance.

Upvotes: 1

Views: 1112

Answers (1)

Eduard Ilyasov

Reputation: 3308

You can solve your problem using this approach:

import numpy as np

# axis=0 sums down each column (one sum per column); axis=1 sums across each row (one sum per row)
def get_sums_by_threshold(data, threshold, axis):
    result = list(np.where(data >= threshold, data, 0).sum(axis=axis))
    return result

xpatient=5
sd_healthy=2
xhealthy=7
sd_patient=2
thresholdvalue1=(xpatient-sd_healthy)*10
thresholdvalue2=(((xhealthy+sd_patient))*10)

np.random.seed(100) # to keep the generated array reproducible
data = np.random.randint(10,size=(1000,100))
thresholds = [num / 10.0 for num in range(thresholdvalue1, thresholdvalue2+1)]

sums = list(map(lambda x: get_sums_by_threshold(data, x, axis=0), thresholds))

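But you should know that your initial array contains only integer values, so you will get the same result for all thresholds that lie between the same two consecutive integers (e.g. 3.1, 3.2, ..., 3.9 and 4.0 all keep exactly the values that are 4 or larger). Only 3.0 behaves differently from 3.1, because the comparison is >= and the value 3 still passes it. If you want to store float numbers from 0.0 to 8.9 in steps of 0.1 in your initial array with the specified shape, you can do the following: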

data = np.random.randint(90,size=(1000,100)) / 10.0
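A quick way to check the difference between integer and float data is the sketch below. The helper name `row_sums`, the use of `np.random.RandomState` and the particular thresholds compared are assumptions of mine, chosen only to illustrate the point.

```python
import numpy as np

rng = np.random.RandomState(100)  # seeded generator so the check is repeatable
int_data = rng.randint(10, size=(1000, 100))            # integers 0..9
float_data = rng.randint(90, size=(1000, 100)) / 10.0   # floats 0.0..8.9

def row_sums(data, threshold):
    # Keep entries at or above the threshold, then sum each row.
    return np.where(data >= threshold, data, 0).sum(axis=1)

# Integer data: 3.1 and 4.0 both keep exactly the entries that are >= 4.
same_for_ints = np.array_equal(row_sums(int_data, 3.1), row_sums(int_data, 4.0))

# Integer data: 3.0 also keeps the 3s, so it should differ from 3.1.
differs_at_three = not np.array_equal(row_sums(int_data, 3.0), row_sums(int_data, 3.1))

# Float data: neighbouring thresholds should now give different sums.
differs_for_floats = not np.array_equal(row_sums(float_data, 3.1), row_sums(float_data, 3.2))

print(same_for_ints, differs_at_three, differs_for_floats)
```

With integer data only about ten of the 61 thresholds can produce distinct results, which is why the float version is needed if every 0.1 step should matter.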

Upvotes: 2
