Reputation: 75
I am trying to run a paired Wilcoxon test for multiple variables using a for loop. However, it gives me an error regarding the formula (x~y). I have tried different ways to define x, y but without success.
I am attaching a subset of Data and the code with the error messages.
Visit   Var1     Var2     Var3
BSL     NA       24.378   23.045
BSL     9.602    10.08    21.624
BSL     9.01     0        10.858
BSL     4.524    17.86    9
BSL     3.75     8.656    22.575
BSL     NA       NA       15.83
BSL     6.596    5.34     16.956
BSL     7.065    17.801   16.505
BSL     6.877    3.408    NA
BSL     15.651   NA       31.983
LV      NA       18.226   21.009
LV      2.225    6.605    14.191
LV      7.417    15.61    NA
LV      1.42     NA       1.392
LV      NA       15.965   22.149
LV      NA       NA       6.701
LV      2.752    24.364   NA
LV      6.504    7.371    27.116
LV      7.594    14.391   13.875
LV      6.652    NA       21.985
# 1st test
for (i in (2:ncol(Data_pairs))) {
group <- Data_pairs[,1]
result <- wilcox_test(data=Data_pairs, Data_pairs[,i]~group, paired = TRUE)
result
}
# Error: Can't extract columns that don't exist.
# x Column `group` doesn't exist.
## If I use the following codes for the wilcoxon test, the above loop gives me an error again:
# 2nd test
result <- wilcox_test(data=Data_pairs, Data_pairs[,i]~Visit, paired = TRUE)
result
# Error: Can't extract columns that don't exist.
# x Column `Data_pairs[, i]` doesn't exist.
# 3rd test (using wilcox.test function)
result <- wilcox.test(data=Data_pairs, Data_pairs[,i]~group, paired = TRUE)
result
# Error in wilcox.test.default(x = c(9.602, 9.01, 4.524, 3.75, :
# 'x' and 'y' must have the same length
> dput(Data_pairs)
structure(list(Visit = c("BSL", "BSL", "BSL", "BSL", "BSL", "BSL",
"BSL", "BSL", "BSL", "BSL", "LV", "LV", "LV", "LV", "LV", "LV",
"LV", "LV", "LV", "LV"), Var1 = c(NA, 9.602, 9.01, 4.524, 3.75,
NA, 6.596, 7.065, 6.877, 15.651, NA, 2.225, 7.417, 1.42, NA,
NA, 2.752, 6.504, 7.594, 6.652), Var2 = c(24.378, 10.08, 0, 17.86,
8.656, NA, 5.34, 17.801, 3.408, NA, 18.226, 6.605, 15.61, NA,
15.965, NA, 24.364, 7.371, 14.391, NA), Var3 = c(23.045, 21.624,
10.858, 9, 22.575, 15.83, 16.956, 16.505, NA, 31.983, 21.009,
14.191, NA, 1.392, 22.149, 6.701, NA, 27.116, 13.875, 21.985)), class = "data.frame", row.names = c(NA,
-20L))
Is there any suggestion/advice on how to correct this?
Thank you!
Upvotes: 3
Views: 2416
Reputation: 1416
It seems the wilcox_test() function only accepts column names as part of the formula (which is not the case, for instance, with the lm() function, where your notation for specifying the formula would have worked).
As I don't have the rstatix package, where the wilcox_test() function you are using is apparently defined (https://www.rdocumentation.org/packages/rstatix/versions/0.7.0/topics/wilcox_test), I can only suggest that you construct the formula from the column names as follows:
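cols <- colnames(Data_pairs)
for (i in 2:ncol(Data_pairs)) {
  # Build e.g. Var1 ~ Visit from the column names
  formula <- as.formula(paste(cols[i], cols[1], sep = "~"))
  result <- wilcox_test(data = Data_pairs, formula = formula, paired = TRUE)
  print(result)  # results are not auto-printed inside a for loop
}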
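As a variation on the loop above, base R's reformulate() builds the same formulas without pasting strings, and keeping them in a named list makes it easy to store one result per variable. This is only a sketch: the column names are the ones from your dput() output, and the wilcox_test() call is guarded because I am assuming, not verifying, that rstatix is installed and that Data_pairs exists in your session. I have also not checked how paired = TRUE copes with the missing values in your data.

```r
# Column names as in the question's Data_pairs
cols <- c("Visit", "Var1", "Var2", "Var3")

# One formula per analysis variable, e.g. Var1 ~ Visit
formulas <- lapply(cols[-1], function(v) reformulate(cols[1], response = v))
names(formulas) <- cols[-1]
print(formulas$Var1)

# Assumption: rstatix is installed and Data_pairs is defined; otherwise this is skipped
if (requireNamespace("rstatix", quietly = TRUE) && exists("Data_pairs")) {
  results <- lapply(formulas, function(f) {
    rstatix::wilcox_test(Data_pairs, f, paired = TRUE)
  })
  print(results)
}
```

Because the list is named, results$Var2 would give the test for Var2 directly.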
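Regarding the wilcox.test() function you also tried: its default method has the signature wilcox.test(x, y, ...), where x and y are the analysis variables. Both must be numeric; they cannot be factors, as is the case with the group variable in the wilcox_test() function. wilcox.test() also has a formula method, which is what your call dispatched to, but that method drops the rows with missing values before splitting the response by group. The two groups then have different lengths (for Var1, 8 BSL values against 7 LV values), which is why the paired test stops with the "'x' and 'y' must have the same length" error.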
(Ref: https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/wilcox.test)
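To illustrate the wilcox.test(x, y, ...) interface, here is a minimal sketch using the data from your dput() output. It rests on an assumption that your post does not confirm: that the rows are in the same subject order within each visit, so the i-th BSL row and the i-th LV row form a pair. If that is not true for your data, the pairing below is wrong.

```r
# Data copied from the dput() output in the question
Data_pairs <- data.frame(
  Visit = rep(c("BSL", "LV"), each = 10),
  Var1 = c(NA, 9.602, 9.01, 4.524, 3.75, NA, 6.596, 7.065, 6.877, 15.651,
           NA, 2.225, 7.417, 1.42, NA, NA, 2.752, 6.504, 7.594, 6.652),
  Var2 = c(24.378, 10.08, 0, 17.86, 8.656, NA, 5.34, 17.801, 3.408, NA,
           18.226, 6.605, 15.61, NA, 15.965, NA, 24.364, 7.371, 14.391, NA),
  Var3 = c(23.045, 21.624, 10.858, 9, 22.575, 15.83, 16.956, 16.505, NA, 31.983,
           21.009, 14.191, NA, 1.392, 22.149, 6.701, NA, 27.116, 13.875, 21.985)
)

# ASSUMPTION: the i-th BSL row and the i-th LV row belong to the same subject
bsl <- Data_pairs[Data_pairs$Visit == "BSL", ]
lv  <- Data_pairs[Data_pairs$Visit == "LV", ]

results <- list()
for (v in names(Data_pairs)[-1]) {
  # x and y are numeric vectors of equal length; with paired = TRUE,
  # a pair is dropped when either of its two values is NA
  results[[v]] <- wilcox.test(bsl[[v]], lv[[v]], paired = TRUE)
  print(results[[v]])
}
```

Passing the two visits as separate vectors keeps the pairs aligned even where values are missing, since incomplete pairs are removed together rather than one group at a time.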
Upvotes: 1