Inês Diniz

Reputation: 3

How to run a Student's t-test in a loop in R?

I'm new to R, and what I want to do is very simple, but I need help.

img

I have a data set that looks like the one above, where Spot number is the "name" of a protein, grupo is the group (I or II), and APF is the fluorescence reading. I want to run a Student's t-test for each protein, comparing groups I and II, in a loop.

The data shown above contains only one protein (147), but my real data set has 444 proteins.

Upvotes: 0

Views: 265

Answers (2)

Molx

Reputation: 6921

Starting with some fake data:

set.seed(0)
Spot.number <- rep(147:149, each=10)
grupo <- rep(rep(1:2, each=5), 3)
APF <- rnorm(30)
gel <- data.frame(Spot.number, grupo, APF)

> head(gel)
  Spot.number grupo        APF
1         147     1  2.1780699
2         147     1 -0.2609347
3         147     1 -1.6125236
4         147     1  1.7863384
5         147     1  2.0325473
6         147     2  0.6261739

You can use lapply to loop through the subsets of gel, split by the Spot.number:

tests <- lapply(split(gel, gel$Spot.number), function(spot) t.test(APF ~ grupo, spot))

or just

tests <- by(gel, gel$Spot.number, function(spot) t.test(APF ~ grupo, spot))
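Since the question asks for a loop, the same idea can also be written as an explicit for loop, which may be easier to follow when you are new to R. This is only a minimal sketch: it rebuilds the fake gel data so it runs on its own, and the names spots and tests_loop are made up for illustration.

```r
# Recreate the fake data from above so the sketch runs on its own
set.seed(0)
gel <- data.frame(
  Spot.number = rep(147:149, each = 10),
  grupo       = rep(rep(1:2, each = 5), 3),
  APF         = rnorm(30)
)

# One entry per protein; 'spots' and 'tests_loop' are illustrative names
spots <- unique(gel$Spot.number)
tests_loop <- setNames(vector("list", length(spots)), spots)

for (s in spots) {
  # Keep only the rows for the current protein
  spot <- gel[gel$Spot.number == s, ]
  # Compare APF between the two groups for this protein
  tests_loop[[as.character(s)]] <- t.test(APF ~ grupo, data = spot)
}
```

The result is a named list of htest objects, just like the one lapply returns, so the extraction steps that follow should work on it unchanged.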

You can then move on to e.g. taking only the p values:

sapply(tests, "[[", "p.value")

#      147       148       149 
#0.2941609 0.9723856 0.5726007 

or the confidence intervals:

sapply(tests, "[[", "conf.int")
#           147       148        149
# [1,] -0.985218 -1.033815 -0.8748502
# [2,]  2.712395  1.066340  1.4240488

The resulting vector or matrix already carries the Spot.number values as names, which makes it easy to match each result back to its protein.
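If you would rather have everything in one table, the pieces extracted above can be combined into a data frame with one row per protein. This is a rough sketch under the same fake data; summary_df and its column names are my own choices for illustration, not anything t.test produces.

```r
# Recreate the fake data and the list of tests so the sketch runs on its own
set.seed(0)
gel <- data.frame(
  Spot.number = rep(147:149, each = 10),
  grupo       = rep(rep(1:2, each = 5), 3),
  APF         = rnorm(30)
)
tests <- lapply(split(gel, gel$Spot.number),
                function(spot) t.test(APF ~ grupo, data = spot))

# 2 x n matrix: row 1 = lower bound, row 2 = upper bound
ci <- sapply(tests, "[[", "conf.int")

# One row per protein; column names here are illustrative
summary_df <- data.frame(
  Spot.number = names(tests),
  statistic   = unname(sapply(tests, function(x) x$statistic)),
  p.value     = sapply(tests, "[[", "p.value"),
  conf.low    = ci[1, ],
  conf.high   = ci[2, ],
  row.names   = NULL
)
```

A table like this can be sorted or filtered by p.value, which is likely to be more convenient than a list once there are 444 proteins.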

Upvotes: 1

David Robinson

Reputation: 78590

You can perform a t.test within each group using dplyr and my broom package. If your data is stored in a data frame called dat, you would do:

library(dplyr)
library(broom)

results <- dat %>%
    group_by(Spot.number) %>%
    do(tidy(t.test(APF ~ grupo, .)))

This works by performing t.test(APF ~ grupo, .) on each group defined by Spot.number. The tidy function from broom then turns it into a one-row data frame so that it can be recombined. The results data frame will then contain one row per protein (Spot.number) with columns including estimate, statistic, and p.value.
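To try this without the real data, here is a self-contained sketch that builds a small fake dat (three proteins, two groups of five readings each, all invented) and runs the same pipeline. The final filter step and its 0.05 cutoff are only an illustration of working with the result, not part of the approach itself.

```r
library(dplyr)
library(broom)

# Fake data for illustration only: 3 proteins, 2 groups, 5 readings per group
set.seed(0)
dat <- data.frame(
  Spot.number = rep(147:149, each = 10),
  grupo       = rep(rep(1:2, each = 5), 3),
  APF         = rnorm(30)
)

# One t-test per protein, each tidied into a single row
results <- dat %>%
  group_by(Spot.number) %>%
  do(tidy(t.test(APF ~ grupo, data = .))) %>%
  ungroup()

# Example follow-up: keep proteins below an arbitrary, illustrative cutoff
significant <- filter(results, p.value < 0.05)
```

Because results is an ordinary data frame, the usual dplyr verbs such as filter and arrange can be applied to it directly.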

See this vignette for more on the combination of dplyr and broom.

Upvotes: 1
