Comparing distributions of values to a specific value

Question

I have 1000 distributions, each of 459 floats between 0.0 and 1.0, stored in the variable prop_list_test2.

I also have 1000 values, stored as p_95_null, to compare each distribution against. For each distribution, I am trying to find the proportion of the distribution that is >= its p_95_null counterpart. So I want to compare the first distribution in prop_list_test2 against the first value in p_95_null and so on, until I have an array of 1000 proportions, pv.

Here is my attempt at doing it, although it's a very messy and non-pythonic way of going about it:

pv = []
index = 0

comp = p_95_null[index] #What we're comparing it to
truth_list = []

while index < len(prop_list_test2):
    test_list = prop_list_test2[index]
    for i in test_list:
        if i >= comp:
            truth_list.append(True)
            test_list = []
            index+=1
        elif i < comp:
            truth_list.append(False)
            test_list = []
            index+=1

    pv.append((sum(truth_list)/len(truth_list)))


print(pv)

My output is [0.06318082788671024, 0.058823529411764705, 0.058823529411764705]. Something isn't working: I was expecting 1000 values in pv, but I only get 3. Which part of my code is causing this? I can't seem to figure it out.

Marat · Accepted Answer

This is a more pythonic way to do this:

pv = [sum(v >= p_95 for v in values) / len(values)
      for values, p_95 in zip(prop_list_test2, p_95_null)]

Explanation:

Overall, this (pv = [... for ... in ...]) is a list comprehension, a Python syntax for building a list by mapping over a sequence.
zip(...) pairs each list of float values with its p_95 threshold, so you can iterate over both without juggling indexes.
The left part does the same job as the last line in your code. The only difference is that the inner for loop is replaced with a generator expression, which is passed straight to sum. Each comparison yields True or False, and sum counts True as 1 and False as 0.
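
To see the pieces working separately, here is a small sketch on made-up data. The two short lists are hypothetical stand-ins for prop_list_test2 and p_95_null, and the comparison uses >= as described in the question:

```python
# Toy stand-ins (hypothetical) for prop_list_test2 and p_95_null
prop_list_test2 = [[0.1, 0.5, 0.9, 0.7], [0.2, 0.4, 0.6, 0.8]]
p_95_null = [0.5, 0.7]

# zip yields one (values, threshold) pair per distribution
pairs = list(zip(prop_list_test2, p_95_null))
print(pairs[0])  # ([0.1, 0.5, 0.9, 0.7], 0.5)

# One bool per value; sum adds True as 1 and False as 0
values, p_95 = pairs[0]
flags = [v >= p_95 for v in values]
print(flags)       # [False, True, True, True]
print(sum(flags))  # 3

# Putting it together: one proportion per distribution
pv = [sum(v >= p_95 for v in values) / len(values)
      for values, p_95 in zip(prop_list_test2, p_95_null)]
print(pv)  # [0.75, 0.25]
```

With the real data, pv should have one entry per distribution, so 1000 entries.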

Code review:

pv = []
index = 0

comp = p_95_null[index] #What we're comparing it to
truth_list = []

# nothing is wrong with this line, but it would be more idiomatic to write:
# for index, test_list in enumerate(prop_list_test2):
while index < len(prop_list_test2):
    test_list = prop_list_test2[index]
    for i in test_list:
        if i >= comp:
            truth_list.append(True)
            test_list = []
            index+=1
        elif i < comp:
            truth_list.append(False)
            test_list = []
            index+=1

    pv.append((sum(truth_list)/len(truth_list)))
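
As for why pv ends up with only 3 values: in the loop above, index appears to be incremented once per value rather than once per distribution. With 459 floats per distribution it reaches 1377 after the third pass (3 × 459), which is past 1000, so the loop stops. In addition, comp and truth_list are set only once, before the loop, so every distribution is compared against p_95_null[0] and the True/False counts pile up across distributions. Below is a minimal sketch of a repaired loop, assuming the loop structure shown above. The function name proportions_at_or_above and the toy data are made up for illustration:

```python
def proportions_at_or_above(distributions, thresholds):
    """Return, for each distribution, the share of values >= its threshold."""
    pv = []
    # enumerate advances index once per distribution, not once per value
    for index, test_list in enumerate(distributions):
        comp = thresholds[index]  # threshold for this distribution
        truth_list = []           # fresh list for each distribution
        for i in test_list:
            truth_list.append(i >= comp)
        pv.append(sum(truth_list) / len(truth_list))
    return pv


# Toy data (hypothetical): two distributions of four values each
distributions = [[0.1, 0.5, 0.9, 0.7], [0.2, 0.4, 0.6, 0.8]]
thresholds = [0.5, 0.7]
print(proportions_at_or_above(distributions, thresholds))  # [0.75, 0.25]
```

Calling proportions_at_or_above(prop_list_test2, p_95_null) on the real data should then give one proportion per distribution.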
