Luke

Reputation: 25

Iterative Wilcoxon test on the rows of a data.frame in R

I would like to write a loop in R that runs the Wilcoxon test (wilcox.test) on each row of a data.frame, comparing two groups of values within that row, and collects the resulting p-value for each row in a data frame alongside the row label. The data.frame looks like this:

> tab[1:5,]
  mol     E12     E15     E22     E25     E26     E27     E38      E44     E47
1   A 7362.40 2475.93 3886.06 5825.59 6882.00 3250.05 3406.65  6416.29 7786.73
2   B 5391.42 2037.88 3330.05 4043.83 5766.20 2591.69 3603.95 14431.89 8320.70
3   C 1195.89  241.24  252.46  865.97 1970.28  899.22  346.36  1135.86 1179.31
4   D  502.64  171.41  434.29  508.22  419.34  260.13  298.14   326.70  167.07
5   E  181.63  171.41  165.30  150.47  164.09  109.19  122.76   212.74  155.60

The columns are: mol, the molecule evaluated (about 20 in total), and E12 to E47, the samples in which each molecule was measured. The groups to compare are P (samples E12, E25, E26, E27, E44) and D (samples E15, E22, E38, E47). The output should look like this:

mol p-value
A   1
B   0.5556
C   0.9048
etc.    

I tried to use a for loop, but I could not get it to work in this context, which is complicated for me. Any help, with comments explaining what each instruction does, would be much appreciated, as I am new to R.

Upvotes: 1

Views: 146

Answers (1)

AkselA

Reputation: 8846

apply() loops over the margins of a matrix or array. With MARGIN = 1 it iterates over the rows: each row is extracted as a vector x and passed to function(x) wilcox.test(x[P], x[D])$p.value, which yields one p-value per row. P and D are logical vectors that mark which elements of x belong to each group.
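To see the two ingredients (row-wise apply() and logical indexing) in isolation, here is a minimal toy sketch. The matrix m, the vector keep, and the column names are made up for illustration and are not part of the question's data.

```r
# Toy 2 x 4 matrix; m and keep are hypothetical names for this illustration.
m <- matrix(1:8, nrow = 2, byrow = TRUE,
            dimnames = list(c("r1", "r2"), c("a", "b", "c", "d")))

# Logical vector: TRUE for the columns to pick out of each row.
keep <- colnames(m) %in% c("a", "c")

# MARGIN = 1 means "call the function once per row";
# inside the function, x is that row as a vector.
row_sums <- apply(m, 1, function(x) sum(x[keep]))
row_sums
```

Each call sees one row, and x[keep] selects only the flagged columns from it; the real example below does the same thing with wilcox.test() instead of sum().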

tab0 <- read.table(text="mol E12 E15 E22 E25 E26 E27 E38 E44 E47
   A 7362.40 2475.93 3886.06 5825.59 6882.00 3250.05 3406.65  6416.29 7786.73
   B 5391.42 2037.88 3330.05 4043.83 5766.20 2591.69 3603.95 14431.89 8320.70
   C 1195.89  241.24  252.46  865.97 1970.28  899.22  346.36  1135.86 1179.31
   D  502.64  171.41  434.29  508.22  419.34  260.13  298.14   326.70  167.07
   E  181.63  171.41  165.30  150.47  164.09  109.19  122.76   212.74  155.60",
   header=TRUE)

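# Drop the label column and convert the rest to a numeric matrix for apply()
tab <- as.matrix(tab0[,-1])

# Logical vectors marking which columns belong to each group
P <- colnames(tab) %in% c("E12", "E25", "E26", "E27", "E44")
D <- colnames(tab) %in% c("E15", "E22", "E38", "E47")

# Run the test once per row (MARGIN = 1) and keep only the p-value
pv <- apply(tab, 1, function(x) wilcox.test(x[P], x[D])$p.value)

# Combine the row labels with the p-values, rounded to 4 significant digits
data.frame(tab0[1], p.val=signif(pv, 4))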

#   mol  p.val
# 1   A 0.5556
# 2   B 0.4127
# 3   C 0.1111
# 4   D 0.1905
# 5   E 0.9048
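Since the question asked about a for loop, the same computation can probably be written that way too. The following is only a sketch of an equivalent loop, not part of the original answer; the names P_cols, D_cols, pv_loop, and res are made up here, and the groups are selected by column name rather than by logical vector.

```r
# Same example data as above, repeated so the sketch runs on its own.
tab0 <- read.table(text="mol E12 E15 E22 E25 E26 E27 E38 E44 E47
   A 7362.40 2475.93 3886.06 5825.59 6882.00 3250.05 3406.65  6416.29 7786.73
   B 5391.42 2037.88 3330.05 4043.83 5766.20 2591.69 3603.95 14431.89 8320.70
   C 1195.89  241.24  252.46  865.97 1970.28  899.22  346.36  1135.86 1179.31
   D  502.64  171.41  434.29  508.22  419.34  260.13  298.14   326.70  167.07
   E  181.63  171.41  165.30  150.47  164.09  109.19  122.76   212.74  155.60",
   header=TRUE)

# Column names of the two groups (hypothetical helper vectors).
P_cols <- c("E12", "E25", "E26", "E27", "E44")
D_cols <- c("E15", "E22", "E38", "E47")

# Pre-allocate one slot per row for the p-values.
pv_loop <- numeric(nrow(tab0))

for (i in seq_len(nrow(tab0))) {
  # tab0[i, cols] is a one-row data frame; unlist() turns it into a numeric vector.
  p_vals <- unlist(tab0[i, P_cols])
  d_vals <- unlist(tab0[i, D_cols])
  # Store only the p-value of the two-sample Wilcoxon test for this row.
  pv_loop[i] <- wilcox.test(p_vals, d_vals)$p.value
}

# Pair each molecule label with its p-value.
res <- data.frame(mol = tab0$mol, p.val = signif(pv_loop, 4))
res
```

This should give the same p-values as the apply() version; apply() is simply the more compact way to express "do this once per row".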

Upvotes: 1
