Reputation: 905
I have two data.frames that look like:
DF1
Col1 Col2 Col3 Col4
0.1854 0.1660 0.1997 0.4632
0.1760 0.1336 0.1985 0.4496
0.1737 0.1316 0.1943 0.4446
0.1660 0.1300 0.1896 0.4439
DF2
Col1 Col2 Col3 Col4
0.2456 0.2107 0.2688 0.5079
0.2399 0.1952 0.2356 0.1143
0.2375 0.1947 0.2187 0.0846
0.2368 0.1922 0.2087 0.1247
I would like to run wilcox.test between the two data.frames, specifically between matching columns, so that:
test1: between Col1 of DF1 and Col1 of DF2
test2: between Col2 of DF1 and Col2 of DF2
and so on.
I used the following script:
for (i in 1:length(DF2)){
test <- apply(DF1, 2, function(x) wilcox.test(x, as.numeric(DF2[[i]]), correct=TRUE))
}
Unfortunately, the output of this script differs from the output of the same test run with the following script:
test1 = wilcox.test(DF1[,1], DF2[,1], correct=FALSE)
test2 = wilcox.test(DF1[,2], DF2[,2], correct=FALSE)
Since the real data.frames have around 100 columns and 200 rows each (both have the same dimensions), I cannot run the test by hand column by column.
The output of dput(DF1) is:
structure(list(Col1 = c(0.1854, 0.1760, 0.1737, 0.1660,....), class = "data.frame", row.names = c(NA, -100L)))
The same for DF2
Upvotes: 3
Views: 109
Reputation: 5566
It might be easier to loop over the column names in your for loop instead:
for (name in colnames(DF2)){
...
wilcox.test(DF1[,name], DF2[,name], correct=FALSE)
...
}
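A runnable version of that loop might look like the sketch below. It assumes both data frames share the same column names; the toy data and the name results are made up for illustration and are not from the question. Note that the loop in the question overwrites test on every iteration, compares every column of DF1 with a single column of DF2, and uses correct=TRUE where the manual calls use correct=FALSE, which is why the outputs differ.

```r
set.seed(1)  # toy data, only for illustration (not the asker's data)
DF1 <- data.frame(Col1 = runif(10), Col2 = runif(10))
DF2 <- data.frame(Col1 = runif(10), Col2 = runif(10))

# Pre-allocate a named list so each result is kept instead of overwritten
results <- setNames(vector("list", ncol(DF1)), colnames(DF1))

for (name in colnames(DF1)) {
  # Compare only the matching columns of the two data frames
  results[[name]] <- wilcox.test(DF1[[name]], DF2[[name]], correct = FALSE)
}

results$Col1$p.value  # p-value for Col1 of DF1 vs Col1 of DF2
```

Each element of results is the full test object, so the statistic and p-value for any column can be looked up by name.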
Upvotes: 1
Reputation: 60462
This is a classic mapply case: mapply is basically a multivariate version of sapply. We use it to step through the columns of both data frames in parallel. First, create some data:
df1 = data.frame(c1 = runif(10), c2 = runif(10), c3 = runif(10), c4 = runif(10))
df2 = data.frame(c1 = runif(10), c2 = runif(10), c3 = runif(10), c4 = runif(10))
Then use mapply:
l = mapply(wilcox.test, df1, df2, SIMPLIFY=FALSE, correct=FALSE)
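Here the variable l is a list with one test result per column pair. So each manual call below gives the same result as the list element that follows it: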
wilcox.test(df1[,1], df2[,1], correct=FALSE)
l[[1]]
wilcox.test(df1[,2], df2[,2], correct=FALSE)
l[[2]]
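With around 100 columns, it is usually more convenient to pull the p-values out of the list than to inspect each element. A minimal sketch, reusing the toy data from this answer; the seed and the name p_values are assumptions added here for illustration:

```r
set.seed(42)  # seed added only to make the toy data reproducible
df1 <- data.frame(c1 = runif(10), c2 = runif(10), c3 = runif(10), c4 = runif(10))
df2 <- data.frame(c1 = runif(10), c2 = runif(10), c3 = runif(10), c4 = runif(10))

# One wilcox.test per pair of matching columns, returned as a list
l <- mapply(wilcox.test, df1, df2, SIMPLIFY = FALSE, correct = FALSE)

# Collect the p-value of each test into a numeric vector
p_values <- sapply(l, function(test) test$p.value)
p_values
```

The vector keeps the column names of df1, so p_values["c2"] is the p-value for the second pair of columns.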
Upvotes: 6