Optimize identification of quantiles along array columns

Question

I have an array A (of size m x n) and a fraction p in [0,1]. I need to produce an m x n boolean array B, with True in the (i,j) entry if A[i,j] is at or above the p^{th} quantile of the column A[:,j].

Here is the code I have used so far.

import numpy as np

m = 200
n = 300

A = np.random.rand(m, n)

p = 0.3

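# Compute the p-quantile of each column, one column at a time.
quant_levels = np.zeros(n)

for i in range(n):
    quant_levels[i] = np.quantile(A[:, i], p)

# Broadcast the n column thresholds across the rows of A.
B = np.array(A >= quant_levels)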

Bill · Accepted Answer

I'm not sure it's much faster, but you should at least be aware that numpy.quantile has an axis keyword argument, so you can compute all the quantiles with one command:

quant_levels = np.quantile(A, p, axis=0)
B = (A >= quant_levels)
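
To check that the one-line version gives the same thresholds as the loop in the question, something like the following sketch could be used. The seeded generator, the names loop_levels and vec_levels, and the timeit repeat count are assumptions made for illustration and are not part of the original code. No timing result is claimed here, since it will depend on the machine and the NumPy version.

```python
import timeit

import numpy as np

# Seed chosen arbitrarily so the check is reproducible (an assumption).
rng = np.random.default_rng(0)
m, n = 200, 300
A = rng.random((m, n))
p = 0.3


def loop_quantiles(A, p):
    """Loop from the question: one np.quantile call per column."""
    levels = np.zeros(A.shape[1])
    for j in range(A.shape[1]):
        levels[j] = np.quantile(A[:, j], p)
    return levels


def vectorized_quantiles(A, p):
    """Single call: axis=0 reduces over rows, one quantile per column."""
    return np.quantile(A, p, axis=0)


loop_levels = loop_quantiles(A, p)
vec_levels = vectorized_quantiles(A, p)

# vec_levels has shape (n,), so it broadcasts across the rows of A.
B = A >= vec_levels

# Both approaches should agree on the per-column thresholds.
levels_match = np.allclose(loop_levels, vec_levels)
print("thresholds match:", levels_match)

# Rough timing comparison; the numbers vary by machine.
t_loop = timeit.timeit(lambda: loop_quantiles(A, p), number=5)
t_vec = timeit.timeit(lambda: vectorized_quantiles(A, p), number=5)
print(f"loop: {t_loop:.4f} s, vectorized: {t_vec:.4f} s (5 runs each)")
```

With axis=0 the result has one entry per column, which is the shape the comparison A >= quant_levels needs in order to broadcast row by row.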
