HappyDuppy

Reputation: 47

using ranksum test within permutation test

I am a bit confused by the definition of the permutation_test function provided by SciPy. The following is what I wrote to calculate the p-value and the null distribution:

from scipy.stats import ranksums, permutation_test

x = disease['Theta']
y = nodisease['Theta']

def ranksum_statistic(x, y, axis):
    stat, _ = ranksums(x, y, axis=axis, alternative='greater')
    return stat

permutation_test(x, y, ranksum_statistic, permutation_type='independent',
                 vectorized=None, n_resamples=9999, batch=None,
                 alternative='greater', axis=0, random_state=None)

This returns a p-value (-log10 p = 3) drastically different from the one I get by running the rank-sum test directly (-log10 p = 150). (Also, if I run the permutation test multiple times, the p-value changes quite a lot.)

I am confused about which result is more appropriate when I want a non-parametric comparison of two arrays of different sizes (with different values) to test whether they come from the same distribution, since the no-disease group should already represent the null distribution.

Upvotes: 0

Views: 45

Answers (1)

Matt Haberland

Reputation: 3873

In the following, note that ranksums performs essentially the same test as mannwhitneyu. The difference is that ranksums reports a z-score rather than the raw integer-valued statistic, and it cannot apply a continuity correction or compute exact p-values, which mannwhitneyu can, depending on your parameters. So I'd always suggest using mannwhitneyu, but as requested, I'll answer in terms of ranksums, using mannwhitneyu only as a reference.
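As a quick sanity check of that equivalence, here is a small sketch with made-up, tie-free data; it assumes (as I understand the correspondence) that ranksums matches mannwhitneyu with method='asymptotic' and use_continuity=False:

import numpy as np
from scipy import stats

# Made-up, tie-free samples just to illustrate the equivalence described above
rng = np.random.default_rng(0)
a, b = rng.random(8), rng.random(10)

res_rs = stats.ranksums(a, b, alternative='greater')
# Normal approximation without continuity correction; should agree with
# ranksums to floating-point precision in the absence of ties
res_mw = stats.mannwhitneyu(a, b, alternative='greater',
                            method='asymptotic', use_continuity=False)
print(res_rs.pvalue, res_mw.pvalue)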

Your code is essentially correct. To check, we compare the p-value given by permutation_test against the p-value provided by mannwhitneyu with method='exact'.

import numpy as np
from scipy import stats

rng = np.random.default_rng(32590345094235)
x = rng.random(6)
y = rng.random(7)

def ranksum_statistic(x, y, axis):
    # Use the rank-sum z-score as the statistic for the permutation test
    stat, _ = stats.ranksums(x, y, axis=axis, alternative='greater')
    return stat

# Note that the samples are packed together in a tuple, (x, y)
res = stats.permutation_test((x, y), ranksum_statistic, permutation_type='independent',
                             n_resamples=9999, alternative='greater', axis=0,
                             random_state=rng)
# Reference: exact Mann-Whitney U p-value
ref = stats.mannwhitneyu(x, y, alternative='greater', method='exact')
np.allclose(res.pvalue, ref.pvalue, rtol=1e-14)  # True

Note that the only meaningful change I made to your code was to pass (x, y) as a tuple rather than as separate arguments. (permutation_test does not accept separate samples as separate arguments; they must be packed in a sequence.)

For these small samples of sizes 6 and 7, we get an exact p-value because n_resamples=9999 is sufficient to perform all binom(6 + 7, 6) = binom(6 + 7, 7) = 1716 permutations. The minimum possible p-value we can get in this case is therefore 1/1716.
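A quick check of those numbers (plain arithmetic, nothing specific to SciPy):

from math import comb

# Number of distinct ways to split 13 observations into groups of 6 and 7
print(comb(6 + 7, 6), comb(6 + 7, 7))  # 1716 1716
# Smallest exact p-value attainable with these sample sizes
print(1 / comb(13, 6))                 # ~0.000583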

If n_resamples is insufficient to perform an exact test, a randomized test is performed, in which case the smallest p-value you can get is 1 / (n_resamples + 1). This occurs when the observed value of the statistic is more extreme than that of every permutation resample. With n_resamples=9999, the smallest p-value permutation_test can return is therefore 1/10000, i.e. -log10(p) = 4; a value like your -log10(p) = 3 is close to that resolution limit, and nowhere near the -log10(p) = 150 reported by ranksums. Yes, the randomized p-value will be stochastic. If you need it to be reproducible, pass a seeded NumPy Generator explicitly as I have (random_state=rng). Note that in the long run (i.e. over many separate data sets), these randomized p-values still control the false-positive rate almost exactly, even though they are only an approximation of the exact p-value for the given data. For more information, see Permutation p-values should never be zero.
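To illustrate, here is a sketch with made-up samples large enough that the test is randomized rather than exact: the p-value varies between runs with different Generators but is reproduced exactly when the same seed is reused.

import numpy as np
from scipy import stats

# Made-up samples; binom(45, 20) permutations far exceed 9999 resamples,
# so permutation_test performs a randomized test
rng = np.random.default_rng(2718281828)
x = rng.normal(loc=0.5, size=20)
y = rng.normal(loc=0.0, size=25)

def ranksum_statistic(x, y, axis):
    return stats.ranksums(x, y, axis=axis, alternative='greater').statistic

kwargs = dict(n_resamples=9999, alternative='greater')
p1 = stats.permutation_test((x, y), ranksum_statistic,
                            random_state=np.random.default_rng(1), **kwargs).pvalue
p2 = stats.permutation_test((x, y), ranksum_statistic,
                            random_state=np.random.default_rng(2), **kwargs).pvalue
p3 = stats.permutation_test((x, y), ranksum_statistic,
                            random_state=np.random.default_rng(1), **kwargs).pvalue
print(p1, p2, p3)  # p1 and p2 generally differ slightly; p1 == p3 exactly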

ranksums always uses an analytical approximation of the null distribution, so it can return smaller p-values. That might be more accurate, but only if your sample sizes are "sufficiently large" for the asymptotic approximation to be valid. To assess whether the tiny p-value you're getting is plausible, note that the largest possible statistic of the Mann-Whitney U test is m*n, where m and n are the sample sizes. If that is the statistic for your observed data, the exact p-value is 1 / binom(m + n, m), which is the smallest exact p-value attainable for those sample sizes (compare the 1/1716 above). If ranksums returns a p-value smaller than that, it is too small.
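For example, with hypothetical sample sizes of 300 per group (the question doesn't give the actual sizes), the bound works out to roughly 1e-179, so a -log10(p) near 150 is not impossible for large samples:

from math import comb, log10

# Hypothetical sample sizes; substitute your own m and n
m, n = 300, 300
# Largest attainable -log10(p) for an exact one-sided test at these sizes
print(log10(comb(m + n, m)))  # ~179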

But again, use mannwhitneyu with method='exact' if you don't have ties and your samples are small enough that it returns in a reasonable time. Note that the exact p-value calculation in SciPy 1.14 is much more efficient than in previous versions, so if you can't get an exact p-value with an older version, try upgrading. If you do have ties and/or your samples are larger than mannwhitneyu with method='exact' can handle, please clarify.

Upvotes: 0
