Reputation: 365
I have a function that calculates the probability of belonging to category k ~ {1, 2, ..., K} based on eta
and the cutoff points, c
between the categories.
import numpy as np
import scipy.special as ss
def pmf(K: int, eta: np.ndarray, c: np.ndarray) -> np.array:
"""
Example
-------
>>> K = 5
>>> p = np.array([[0.1, 0.3, 0.2, 0.35, 0.05]])
>>> cum_p = np.cumsum(p)
>>> cum_logits = ss.logit(cum_p[:-1])
>>> eta = np.zeros((1, 1))
>>> p_K = pmf(K=K, eta=eta, c=cum_logits)
>>> print(p_K)
[[0.1 0.3 0.2 0.35 0.05]]
"""
p = np.zeros((eta.shape[1], K))
for k in range(K):
if k == 0:
p[:,k] = 1 - ss.expit(eta - c[0])
elif k == K - 1:
p[:,k] = ss.expit(eta - c[-1])
else:
p[:,k] = ss.expit(eta - c[k - 1]) - ss.expit(eta - c[k])
return p
Is it possible to remove the for-loop? For the boundaries its easily done as follows:
p[:,0] = 1 - ss.expit(eta - c[:,0])
p[:,-1] = ss.expit(eta - c[:,-1])
But how to "deal" with the other values of k
?
Upvotes: 0
Views: 43
Reputation: 9146
You should be able to do something along these lines:
p = np.empty((eta.shape[1], K))
p[:,0] = 1 - ss.expit(eta - c[0])
k = np.arange(1, K-1)
p[:,k] = ss.expit(eta - c[k - 1]) - ss.expit(eta - c[k])
p[:,-1] = ss.expit(eta - c[-1])
Note: I changed p
to be initialized as np.empty
since that's faster for larger arrays.
Upvotes: 1