Reputation: 47
I am a bit confused by the definition of the permutation_test function provided by SciPy. The following is what I wrote to calculate the p-value and null distribution:
x = disease['Theta']
y = nodisease['Theta']

def ranksum_statistic(x, y, axis):
    stat, _ = ranksums(x, y, axis=axis, alternative='greater')
    return stat

permutation_test(x, y, ranksum_statistic, permutation_type='independent',
                 vectorized=None, n_resamples=9999, batch=None,
                 alternative='greater', axis=0, random_state=None)
This returns a p-value drastically different (-log10(p) = 3) from directly using the rank-sum test (-log10(p) = 150). (Also, if I run the permutation test multiple times, the p-value changes quite a lot.)
I am confused as to which one is more appropriate if I want a non-parametric comparison of two different-sized arrays, testing whether they came from the same distribution (the values themselves differ), since the no-disease group should already represent the null distribution.
Upvotes: 0
Views: 45
Reputation: 3873
In the following, note that ranksums performs essentially the same test as mannwhitneyu. The difference is that ranksums reports a z-score of the statistic rather than the raw integral statistic, and it is incapable of performing a continuity correction or calculating exact p-values, as mannwhitneyu can depending on your parameters. So I'd always suggest using mannwhitneyu, but as requested, I'll answer in terms of ranksums, only using mannwhitneyu as a reference.
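To illustrate that relationship, here is a small sketch (the data are arbitrary, just for demonstration): when there are no ties and the continuity correction is disabled, the two functions agree on the asymptotic p-value, and only the reported statistic differs (a z-score versus the raw U count).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x, y = rng.random(20), rng.random(25)  # continuous data, so no ties

# ranksums reports a z-score; mannwhitneyu reports the U statistic.
res_rs = stats.ranksums(x, y, alternative='greater')
res_mw = stats.mannwhitneyu(x, y, alternative='greater',
                            method='asymptotic', use_continuity=False)

# Without ties or a continuity correction, the p-values coincide.
print(res_rs.statistic, res_mw.statistic)
print(res_rs.pvalue, res_mw.pvalue)
```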
Your code is essentially correct. To check, we compare the p-value given by permutation_test against the p-value provided by mannwhitneyu with method='exact'.
import numpy as np
from scipy import stats

rng = np.random.default_rng(32590345094235)
x = rng.random(6)
y = rng.random(7)

def ranksum_statistic(x, y, axis):
    stat, _ = stats.ranksums(x, y, axis=axis, alternative='greater')
    return stat

res = stats.permutation_test((x, y), ranksum_statistic, permutation_type='independent',
                             n_resamples=9999, alternative='greater', axis=0,
                             random_state=rng)
ref = stats.mannwhitneyu(x, y, alternative='greater', method='exact')
np.allclose(res.pvalue, ref.pvalue, rtol=1e-14)  # True
Note that the only meaningful change I made to your code was to pass (x, y) as a tuple rather than as separate arguments. (permutation_test does not accept separate samples as separate arguments; they must be packed in a sequence.)
For these small samples of sizes 6 and 7, we get an exact p-value because n_resamples=9999 is sufficient to perform all binom(6 + 7, 6) = binom(6 + 7, 7) = 1716 permutations. The minimum possible p-value we can get in this case is therefore 1/1716.
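You can verify these counts yourself with the standard library's binomial coefficient:

```python
from math import comb

m, n = 6, 7
# Each permutation corresponds to one way of choosing which 6 of the
# 13 pooled observations belong to the first sample.
total = comb(m + n, m)  # equals comb(m + n, n) by symmetry
print(total)            # 1716
print(1 / total)        # smallest achievable exact p-value
```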
If n_resamples is insufficient to perform an exact test, a randomized test is performed, in which case the smallest p-value you can get is 1 / (n_resamples + 1). This occurs when the observed value of the statistic is more extreme than that of every other permutation resample. With n_resamples=9999, that means the smallest p-value you can get from permutation_test is 1/10000, i.e. -log10(p) = 4. Yes, the p-value will be stochastic. If you need the stochastic p-value to be reproducible, pass a seeded NumPy Generator explicitly as I have (random_state=rng). Note that in the long run (i.e. over many separate data sets), these randomized p-values still control the false-positive rate almost exactly, even if they are only an approximation of the exact p-value for given data. For more information, see "Permutation p-values should never be zero".
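As a sketch of the reproducibility point (the sample sizes and n_resamples here are made up so that the test is randomized rather than exact):

```python
import numpy as np
from scipy import stats

def ranksum_statistic(x, y, axis):
    return stats.ranksums(x, y, axis=axis, alternative='greater').statistic

data_rng = np.random.default_rng(0)
x, y = data_rng.random(6), data_rng.random(7)

def randomized_pvalue(seed):
    # n_resamples=500 < binom(13, 6) = 1716, so the test is randomized
    rng = np.random.default_rng(seed)
    return stats.permutation_test(
        (x, y), ranksum_statistic, permutation_type='independent',
        n_resamples=500, alternative='greater', random_state=rng).pvalue

# Same seed -> identical p-value; unseeded runs would vary between calls.
print(randomized_pvalue(42) == randomized_pvalue(42))
```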
ranksums always uses an analytical approximation of the null distribution, so it can return smaller p-values. This might be more accurate, but only if your sample sizes are "sufficiently large" such that the asymptotic approximation is valid. To assess whether the tiny p-value you're getting is reasonable, note that the largest possible statistic of the Mann-Whitney U test is m*n, where m and n are the sample sizes. If that is the statistic for your observed data, then only one of the binom(m + n, m) arrangements is that extreme, so the exact p-value will be 1 / binom(m + n, m). If ranksums is returning a smaller p-value than that, it is too small.
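To make that sanity check concrete, here is a sketch with artificial, completely separated samples, chosen so that the observed statistic attains its maximum m*n:

```python
import numpy as np
from math import comb
from scipy import stats

m, n = 6, 7
x = np.arange(100, 100 + m)  # every x exceeds every y, so U = m*n
y = np.arange(n, dtype=float)

res_exact = stats.mannwhitneyu(x, y, alternative='greater', method='exact')
print(res_exact.statistic)   # 42, i.e. m*n
print(res_exact.pvalue)      # 1/comb(13, 6): the exact floor for this design

# Any ranksums p-value below that floor would be an artifact of the
# normal approximation, not evidence of a stronger effect.
res_asymp = stats.ranksums(x, y, alternative='greater')
print(res_asymp.pvalue)
```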
But again, use mannwhitneyu with method='exact' if you don't have ties and your samples are small enough that it returns in reasonable time. Note that the exact p-value calculation in SciPy 1.14 is much more efficient than in previous versions, so if you can't get an exact p-value in an older version, try upgrading. If you do have ties and/or your samples are larger than mannwhitneyu with method='exact' can handle, please clarify.
Upvotes: 0