Reputation: 25
I would like to write a loop in R that runs the Wilcoxon test (wilcox.test) on each row of a data.frame, comparing two groups of values within the row, and collects the p-value for each row in a data frame alongside its row label. The data.frame looks like this:
> tab[1:5,]
mol E12 E15 E22 E25 E26 E27 E38 E44 E47
1 A 7362.40 2475.93 3886.06 5825.59 6882.00 3250.05 3406.65 6416.29 7786.73
2 B 5391.42 2037.88 3330.05 4043.83 5766.20 2591.69 3603.95 14431.89 8320.70
3 C 1195.89 241.24 252.46 865.97 1970.28 899.22 346.36 1135.86 1179.31
4 D 502.64 171.41 434.29 508.22 419.34 260.13 298.14 326.70 167.07
5 E 181.63 171.41 165.30 150.47 164.09 109.19 122.76 212.74 155.60
The columns are: mol, the molecule evaluated (about 20 in total), and E12 to E47, the samples in which each molecule was measured. The groups to be compared are P (samples E12, E25, E26, E27, E44) and D (samples E15, E22, E38, E47). The output should look like this:
mol p-value
A 1
B 0.5556
C 0.9048
etc.
I tried to use a for loop, but I could not get it to work in this context, which is complicated for me. Any help, with comments explaining what each instruction does for a newbie like me, is much appreciated.
Upvotes: 1
Views: 146
Reputation: 8846
apply() works like a loop over matrices and arrays. Here, with MARGIN = 1 (the second argument), it loops along the rows. Each row, temporarily converted into a vector x, is passed on to function(x) wilcox.test(x[P], x[D])$p.value, so the result is one p-value per row. P and D are logical vectors specifying which elements of x belong to each sample.
tab0 <- read.table(text="mol E12 E15 E22 E25 E26 E27 E38 E44 E47
A 7362.40 2475.93 3886.06 5825.59 6882.00 3250.05 3406.65 6416.29 7786.73
B 5391.42 2037.88 3330.05 4043.83 5766.20 2591.69 3603.95 14431.89 8320.70
C 1195.89 241.24 252.46 865.97 1970.28 899.22 346.36 1135.86 1179.31
D 502.64 171.41 434.29 508.22 419.34 260.13 298.14 326.70 167.07
E 181.63 171.41 165.30 150.47 164.09 109.19 122.76 212.74 155.60",
header=TRUE)
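# Drop the mol column and convert to a numeric matrix, since apply() works on matrices
tab <- as.matrix(tab0[,-1])
# Logical vectors marking which columns belong to group P and to group D
P <- colnames(tab) %in% c("E12", "E25", "E26", "E27", "E44")
D <- colnames(tab) %in% c("E15", "E22", "E38", "E47")
# Run wilcox.test on each row (second argument 1 = rows) and keep only the p-value
pv <- apply(tab, 1, function(x) wilcox.test(x[P], x[D])$p.value)
# Combine the molecule labels with the p-values, rounded to 4 significant digits
data.frame(tab0[1], p.val=signif(pv, 4))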
# mol p.val
# 1 A 0.5556
# 2 B 0.4127
# 3 C 0.1111
# 4 D 0.1905
# 5 E 0.9048
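Since the question asked about a for loop specifically, here is a minimal sketch of the same computation written as an explicit loop, in case that is easier to follow than apply(). The names p_cols, d_cols and res are only illustrative and do not come from the question; the data is the same five-row sample, rebuilt so the block runs on its own.

```r
# Same five rows as above, rebuilt so this block is self-contained.
tab0 <- read.table(text="mol E12 E15 E22 E25 E26 E27 E38 E44 E47
A 7362.40 2475.93 3886.06 5825.59 6882.00 3250.05 3406.65 6416.29 7786.73
B 5391.42 2037.88 3330.05 4043.83 5766.20 2591.69 3603.95 14431.89 8320.70
C 1195.89 241.24 252.46 865.97 1970.28 899.22 346.36 1135.86 1179.31
D 502.64 171.41 434.29 508.22 419.34 260.13 298.14 326.70 167.07
E 181.63 171.41 165.30 150.47 164.09 109.19 122.76 212.74 155.60",
header=TRUE)

# Column names of the two groups (illustrative variable names).
p_cols <- c("E12", "E25", "E26", "E27", "E44")  # group P
d_cols <- c("E15", "E22", "E38", "E47")         # group D

# Pre-allocate one slot per row to hold the p-values.
pv <- numeric(nrow(tab0))

for (i in seq_len(nrow(tab0))) {
  # tab0[i, p_cols] is a one-row data frame; unlist() turns it into a plain numeric vector.
  p_vals <- unlist(tab0[i, p_cols])
  d_vals <- unlist(tab0[i, d_cols])
  # Run the test for this row and store only its p-value.
  pv[i] <- wilcox.test(p_vals, d_vals)$p.value
}

# Pair each molecule label with its p-value, rounded to 4 significant digits.
res <- data.frame(mol = tab0$mol, p.val = signif(pv, 4))
res
```

On this sample the loop should give the same p-values as the apply() version. The apply() form is shorter, but the loop makes each step (pick the row, split it into the two groups, test, store) visible.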
Upvotes: 1