R - Cleanest way to run statistical test on every permutation of multiple populations

Question

I have three populations stored as individual vectors. I need to run a statistical test (Wilcoxon, if it matters) on each pair of these three populations.

I want to input three vectors into some block of code and get as output a vector of 6 p-values (one p-value is the result of one test and is a double).

I have a method that works, but I am new to R, and from what I've been reading I feel there should be a better way to write this code, possibly by storing the vectors as a data frame and using vectorization.

Here is the code I have:

library(arrangements)

runAllTests <- function(pop1,pop2,pop3) {
    populations <- list(pop1=pop1,pop2=pop2,pop3=pop3)
    colLabels <- c("pop1", "pop2", "pop3")

    #This line makes a data frame where each column is a pair of labels
    perms <- data.frame(t(permutations(colLabels,2)))

    pvals <- vector()

    #This for loop gets each column of that data frame
    for (pair in perms[,]) {
        pair <- as.vector(pair)
        p1 <- as.numeric(unlist(populations[pair[1]]))
        p2 <- as.numeric(unlist(populations[pair[2]]))

        pvals <- append(pvals, wilcox.test(p1, p2,alternative=c("less"))$p.value)
    }

    return(pvals)
}

What is a more idiomatic R way to write this code?

Note: Generating populations and comparing them all to each other is a common enough thing (and tricky enough to code) that I think this question will apply to more people than myself.

EDIT: I forgot that my actual populations are of different sizes. This means I cannot make a data frame out of the vectors (as far as I know). I can make a list of vectors though. I have updated my code with a version that works.

lroha · Accepted Answer

Here is one approach using combn(), whose FUN argument applies a function to each combination. That makes it easy to run wilcox.test() on every pair of variables.

set.seed(234)

# Create dummy data
df <- data.frame(replicate(3, sample(1:5, 100, replace = TRUE)))

# Apply wilcox.test to all combinations of variables in data frame.
res <- combn(names(df), 2, function(x) {
  list(
    data = paste(x[1], x[2]),
    p = wilcox.test(x = df[[x[1]]], y = df[[x[2]]])$p.value
  )
}, simplify = FALSE)

# Bind results
do.call(rbind, res) 

     data    p         
[1,] "X1 X2" 0.45282   
[2,] "X1 X3" 0.06095539
[3,] "X2 X3" 0.3162251
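
Note that combn() yields the 3 unordered pairs and the tests above are two-sided, whereas the question asks for 6 p-values from one-sided tests on populations of different sizes. A minimal base-R sketch of that variant follows, assuming the populations are kept in a named list. The helper name run_all_tests and the simulated data are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: one-sided Wilcoxon test on every ordered pair
# of populations stored in a named list (vectors may differ in length).
run_all_tests <- function(populations, alternative = "less") {
  labels <- names(populations)

  # All ordered pairs (x, y) with x != y: 6 pairs for 3 populations
  pairs <- expand.grid(x = labels, y = labels, stringsAsFactors = FALSE)
  pairs <- pairs[pairs$x != pairs$y, ]

  # Run the test for each pair and keep only the p-value
  pvals <- mapply(
    function(x, y) {
      wilcox.test(populations[[x]], populations[[y]],
                  alternative = alternative)$p.value
    },
    pairs$x, pairs$y
  )

  # Label each p-value with the pair it came from
  names(pvals) <- paste(pairs$x, pairs$y, sep = " vs ")
  pvals
}

# Illustrative data only: three continuous samples of different sizes
set.seed(234)
populations <- list(
  pop1 = rnorm(30),
  pop2 = rnorm(45, mean = 0.5),
  pop3 = rnorm(60, mean = 1)
)

pvals <- run_all_tests(populations)
print(pvals)
```

Keeping the vectors in a list avoids the equal-length requirement of a data frame, and the names on the result make it clear which direction each one-sided test was run in.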
