ana_gg

Reputation: 370

Loop over groups and store the results in a table in R

I have a large dataset of accuracies. As an example:

acc
           V1 V2
1  0.65996025 B1
2  0.55217749 B1
3  0.78412743 B1
4  0.95358681 B1
5  0.23634827 B2
6  0.35234372 B2
7  0.21214891 B2
8  0.03710918 B2
9  0.84751145 B3
10 0.89086948 B3
11 0.59060242 B3
12 0.68724963 B3

I made subgroups:

B1 = acc[acc$V2 == "B1",]
B2 = acc[acc$V2 == "B2",]
B3 = acc[acc$V2 == "B3",]

I want the element-wise difference between each pair of groups, like:

diff_1_2 = B1$V1 - B2$V1
diff_1_3 = B1$V1 - B3$V1
diff_2_3 = B2$V1 - B3$V1

I want to use it to calculate p-values using the following equation:

t.value = mean(diff_1_2) / sd(diff_1_2)
p.value = 2 * pt(-abs(t.value), df = length(diff_1_2) - 1)
sig = ifelse(mean(p.value) < 0.05, "sig", "no")

As you can see, repeating this by hand for every pair is very inefficient. How can I do it in a loop, so that at the end I have a table like this, for example:

        results
B1_B2   sig
B1_B3   sig
B2_B3   sig

Any ideas? Thank you in advance.

Upvotes: 1

Views: 90

Answers (1)

ktiu

Reputation: 2626

Using your (properly formatted) data:

acc <- tibble::tribble(
    ~V1,        ~V2,
    0.65996025, "B1",
    0.55217749, "B1",
    0.78412743, "B1",
    0.95358681, "B1",
    0.23634827, "B2",
    0.35234372, "B2",
    0.21214891, "B2",
    0.03710918, "B2",
    0.84751145, "B3",
    0.89086948, "B3",
    0.59060242, "B3",
    0.68724963, "B3"
)

You can split it like so:

split <- split(acc, ~V2)
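As far as I know, the formula form `split(acc, ~V2)` needs R 4.1.0 or later. On older versions you can pass the grouping column directly. A minimal sketch to check what you get back (the name `groups` is just for illustration, and the data is rebuilt as a plain data.frame so the snippet runs on its own):

```r
# Same example data as above, as a base data.frame
acc <- data.frame(
  V1 = c(0.65996025, 0.55217749, 0.78412743, 0.95358681,
         0.23634827, 0.35234372, 0.21214891, 0.03710918,
         0.84751145, 0.89086948, 0.59060242, 0.68724963),
  V2 = rep(c("B1", "B2", "B3"), each = 4),
  stringsAsFactors = FALSE
)

# Classic form: split by the grouping vector itself
groups <- split(acc, acc$V2)

names(groups)   # one list element per group label
groups$B1$V1    # the accuracies belonging to group B1
```

The result is a named list of data frames, so each group can be looked up by its label.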

You could then define your test function (after some debugging):

your_test <- function(values) {
  t.value <- mean(values) / sd(values)
  p.value <- 2 * pt(-abs(t.value), df = length(values) - 1)
  ifelse(mean(p.value) < 0.05, "sig", "no")
}
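Note that this keeps the formula from the question, which divides the mean by the standard deviation. The usual paired t statistic divides by the standard error, `sd / sqrt(n)`, instead. If that is what you are after, here is a sketch you can check against `t.test(..., paired = TRUE)`. The names `paired_p`, `b1` and `b2` are hypothetical and only there for illustration:

```r
# Two-sided p-value of a paired t-test, computed from the differences
paired_p <- function(values) {
  n <- length(values)
  t.value <- mean(values) / (sd(values) / sqrt(n))  # mean over standard error
  2 * pt(-abs(t.value), df = n - 1)
}

# Groups B1 and B2 from the example data
b1 <- c(0.65996025, 0.55217749, 0.78412743, 0.95358681)
b2 <- c(0.23634827, 0.35234372, 0.21214891, 0.03710918)

manual  <- paired_p(b1 - b2)
builtin <- t.test(b1, b2, paired = TRUE)$p.value

# The two should agree up to floating point error
all.equal(manual, builtin)
```

Swapping this in may change which pairs come out as "sig", so decide which statistic you actually want before building the table.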

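And plug it into a purrr-style map/reduce pipeline:

library(purrr)

# Build every pair of group labels, name each pair (e.g. "B1_B2"),
# take the element-wise difference, test it, and stack the rows.
unique(acc$V2) %>%
  combn(2, simplify = FALSE) %>%
  set_names(map_chr(., paste, collapse = "_")) %>%
  map(~ split[[.x[1]]]$V1 - split[[.x[2]]]$V1) %>%
  imap(~ data.frame(results = your_test(.x), row.names = .y)) %>%
  reduce(rbind)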

Returning:

      results
B1_B2      no
B1_B3      no
B2_B3     sig
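If you would rather not depend on purrr, the same table can be built in base R. This is only a sketch: `pairs` and `res` are names made up for the example, the data is rebuilt as a plain data.frame, and it assumes every group has the same number of rows in matching order, as in your example.

```r
acc <- data.frame(
  V1 = c(0.65996025, 0.55217749, 0.78412743, 0.95358681,
         0.23634827, 0.35234372, 0.21214891, 0.03710918,
         0.84751145, 0.89086948, 0.59060242, 0.68724963),
  V2 = rep(c("B1", "B2", "B3"), each = 4),
  stringsAsFactors = FALSE
)

# Same test as above (formula taken from the question as written)
your_test <- function(values) {
  t.value <- mean(values) / sd(values)
  p.value <- 2 * pt(-abs(t.value), df = length(values) - 1)
  ifelse(mean(p.value) < 0.05, "sig", "no")
}

# All unordered pairs of group labels, as a list of length-2 vectors
pairs <- combn(unique(acc$V2), 2, simplify = FALSE)

# Run the test on the element-wise difference of each pair
res <- data.frame(
  results = vapply(pairs, function(p) {
    your_test(acc$V1[acc$V2 == p[1]] - acc$V1[acc$V2 == p[2]])
  }, character(1)),
  row.names = vapply(pairs, paste, character(1), collapse = "_")
)

res
```

`vapply` takes the place of the explicit loop here, and `combn` scales to any number of groups without writing each pair by hand.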

Upvotes: 1
