Geli

Reputation: 75

How to run a paired Wilcoxon test for multiple variables using a for loop

I am trying to run a paired Wilcoxon test for multiple variables using a for loop. However, it gives me an error regarding the formula (x~y). I have tried different ways to define x, y but without success.

Below are a subset of the data and the code with the error messages.

Visit   Var1     Var2     Var3
BSL     NA       24.378   23.045
BSL     9.602    10.08    21.624
BSL     9.01     0        10.858
BSL     4.524    17.86    9
BSL     3.75     8.656    22.575
BSL     NA       NA       15.83
BSL     6.596    5.34     16.956
BSL     7.065    17.801   16.505
BSL     6.877    3.408    NA
BSL     15.651   NA       31.983
LV      NA       18.226   21.009
LV      2.225    6.605    14.191
LV      7.417    15.61    NA
LV      1.42     NA       1.392
LV      NA       15.965   22.149
LV      NA       NA       6.701
LV      2.752    24.364   NA
LV      6.504    7.371    27.116
LV      7.594    14.391   13.875
LV      6.652    NA       21.985



# 1st test
for (i in (2:ncol(Data_pairs))) {  
        group <- Data_pairs[,1]
        
         result <- wilcox_test(data=Data_pairs, Data_pairs[,i]~group, paired = TRUE)
         result
}     

#        Error: Can't extract columns that don't exist.
#        x Column `group` doesn't exist.



## If I use the following call for the Wilcoxon test inside the loop instead, I get an error again:

# 2nd test
      result <- wilcox_test(data=Data_pairs, Data_pairs[,i]~Visit, paired = TRUE)
      result
#       Error: Can't extract columns that don't exist.
#       x Column `Data_pairs[, i]` doesn't exist.
   
     

# 3rd test (using wilcox.test function)
        result <- wilcox.test(data=Data_pairs, Data_pairs[,i]~group, paired = TRUE)
        result
#       Error in wilcox.test.default(x = c(9.602, 9.01, 4.524, 3.75,  : 
#       'x' and 'y' must have the same length
> dput(Data_pairs)

structure(list(Visit = c("BSL", "BSL", "BSL", "BSL", "BSL", "BSL", 
"BSL", "BSL", "BSL", "BSL", "LV", "LV", "LV", "LV", "LV", "LV", 
"LV", "LV", "LV", "LV"), Var1 = c(NA, 9.602, 9.01, 4.524, 3.75, 
NA, 6.596, 7.065, 6.877, 15.651, NA, 2.225, 7.417, 1.42, NA, 
NA, 2.752, 6.504, 7.594, 6.652), Var2 = c(24.378, 10.08, 0, 17.86, 
8.656, NA, 5.34, 17.801, 3.408, NA, 18.226, 6.605, 15.61, NA, 
15.965, NA, 24.364, 7.371, 14.391, NA), Var3 = c(23.045, 21.624, 
10.858, 9, 22.575, 15.83, 16.956, 16.505, NA, 31.983, 21.009, 
14.191, NA, 1.392, 22.149, 6.701, NA, 27.116, 13.875, 21.985)), class = "data.frame", row.names = c(NA, 
-20L))

Is there any suggestion/advice on how to correct this?

Thank you!

Upvotes: 3

Views: 2416

Answers (1)

mastropi

Reputation: 1416

It seems the wilcox_test() function only accepts column names in the formula. This is unlike, for instance, the lm() function, where your notation for specifying the formula would have worked.

I don't have the rstatix package installed, which is apparently where the wilcox_test() function you are using is defined (https://www.rdocumentation.org/packages/rstatix/versions/0.7.0/topics/wilcox_test), so I can only suggest that you construct the formula from the column names as follows:

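library(rstatix)  # provides wilcox_test()

cols <- colnames(Data_pairs)
for (i in 2:ncol(Data_pairs)) {
        # Build the formula from column names, e.g. Var1 ~ Visit
        fml <- as.formula(paste(cols[i], cols[1], sep = "~"))
        result <- wilcox_test(data = Data_pairs, formula = fml, paired = TRUE)
        print(result)  # values are not auto-printed inside a for loop
}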
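The formula-building step can be checked on its own, without rstatix. The sketch below uses base R's reformulate() as an alternative to paste() plus as.formula(); the cols vector is typed in by hand here and is assumed to match colnames(Data_pairs) from the question.

```r
# Column names assumed to match the question's Data_pairs
# (first column is the grouping variable, the rest are outcomes)
cols <- c("Visit", "Var1", "Var2", "Var3")

# One formula per outcome column: <outcome> ~ Visit
formulas <- lapply(cols[-1], function(v) reformulate(cols[1], response = v))
names(formulas) <- cols[-1]

# Show the formulas as text to confirm they reference column names only
print(sapply(formulas, deparse))
```

Each element of formulas should then be usable as the formula argument of wilcox_test(), in the same way as the result of as.formula() in the loop above.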

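Regarding the wilcox.test() function you also tried: its default method has the form wilcox.test(x, y, ...), where x and y are the analysis variables. Both must be numeric; they cannot be factors, as is the case with the group variable in the wilcox_test() function. It also has a formula method, which splits the response by a two-level grouping variable and then calls the default method. Your third call went through that formula method, but the rows with NA were dropped first (note that x in the error message starts at 9.602, not at the leading NA), which left the BSL and LV groups with different numbers of values. A paired test needs two vectors of equal length in matching order, hence the error. (Ref: https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/wilcox.test)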
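As a minimal sketch of the wilcox.test(x, y, ...) form in base R only, the loop below splits each variable by Visit and passes the two vectors directly. It assumes, which the question does not state, that the k-th BSL row and the k-th LV row belong to the same subject; if the pairing is defined differently, the rows need to be matched first. The names bsl, lv, results and p_values are illustrative.

```r
# Data copied from the dput() output in the question
Data_pairs <- data.frame(
  Visit = rep(c("BSL", "LV"), each = 10),
  Var1 = c(NA, 9.602, 9.01, 4.524, 3.75, NA, 6.596, 7.065, 6.877, 15.651,
           NA, 2.225, 7.417, 1.42, NA, NA, 2.752, 6.504, 7.594, 6.652),
  Var2 = c(24.378, 10.08, 0, 17.86, 8.656, NA, 5.34, 17.801, 3.408, NA,
           18.226, 6.605, 15.61, NA, 15.965, NA, 24.364, 7.371, 14.391, NA),
  Var3 = c(23.045, 21.624, 10.858, 9, 22.575, 15.83, 16.956, 16.505, NA, 31.983,
           21.009, 14.191, NA, 1.392, 22.149, 6.701, NA, 27.116, 13.875, 21.985)
)

# ASSUMPTION: row k of the BSL block pairs with row k of the LV block
bsl <- Data_pairs[Data_pairs$Visit == "BSL", ]
lv  <- Data_pairs[Data_pairs$Visit == "LV", ]

results <- list()
for (v in colnames(Data_pairs)[-1]) {
  # With paired = TRUE, pairs with an NA on either side are dropped
  results[[v]] <- wilcox.test(bsl[[v]], lv[[v]], paired = TRUE)
}

# Collect the p-values into one named vector
p_values <- sapply(results, function(r) r$p.value)
print(p_values)
```

Storing each result in a list and printing after the loop avoids the problem that a bare result inside a for loop is not printed.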

Upvotes: 1
