user2502338

Reputation: 899

Taking the mean of different treatment and rep counts in an R dataframe

Below is an example data frame recording biomass accumulation over time for several samples and treatments, where the number of reps differs between the control and the treatments. I can calculate the mean biomass for each sample and treatment group by subsetting the data frame, or by building a (long) list of per-group subsets and calling lapply to take each mean. Is there a simpler or better way to do this that doesn't require "leaving the data frame" and takes less code?

set.seed(34)
df <- data.frame(
    SAMPLE = rep(c("S0","S1","S2"), times = c(4,15,15)),
    TREATMENT = c("Ctl","T1","T2","T3","Ctl","Ctl","Ctl",
                  "T1","T1","T1","T1","T2","T2","T2","T2",
                  "T3","T3","T3","T3","Ctl","Ctl","Ctl","T1",
                  "T1","T1","T1","T2","T2","T2","T2","T3",
                  "T3","T3","T3"),
    REPS = c(1,1,1,1, 1,2,3,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3, 
             1,2,3,4,1,2,3,4,1,2,3,4),
    BIOMASS = round(rnorm(34, mean = 22, sd = 5), digits = 2)
)

head(df)

Thanks, Franklin

Upvotes: 1

Views: 840

Answers (2)

user2502338

Reputation: 899

Thanks, Psidom and akrun. I understand aggregate better now. To do this with dplyr from the tidyverse, it would be:

z <- dplyr::group_by(df, SAMPLE, TREATMENT)
dplyr::summarize(z, mean(BIOMASS))
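For what it's worth, the same grouping can also be written as a single pipeline. This is only a sketch: it assumes dplyr 1.0.0 or later is installed, and it uses a small made-up data frame (`toy`) with the question's column names rather than the real data. The output column names `MEAN_BIOMASS` and `N` are my own choice.

```r
library(dplyr)

# Hypothetical toy data with the same column names as the question;
# S0 has a single control row, S1 has two reps per treatment.
toy <- data.frame(
    SAMPLE    = c("S0", "S1", "S1", "S1", "S1"),
    TREATMENT = c("Ctl", "Ctl", "Ctl", "T1", "T1"),
    BIOMASS   = c(12, 10, 20, 30, 50)
)

# Group by sample and treatment, then compute the mean and the
# number of rows (reps) behind each mean in one step.
summary_tbl <- toy %>%
    group_by(SAMPLE, TREATMENT) %>%
    summarise(MEAN_BIOMASS = mean(BIOMASS), N = n(), .groups = "drop")

print(summary_tbl)
```

Naming the summary column and keeping the per-group count alongside it makes it easier to see which means come from fewer reps.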

Upvotes: 0

akrun

Reputation: 887231

We can use aggregate from base R

aggregate(BIOMASS~SAMPLE + TREATMENT, df, mean)

Or, if 'REPS' and 'TREATMENT' are the grouping variables

aggregate(BIOMASS~REPS + TREATMENT, df, mean)

Or with data.table

library(data.table)
setDT(df)[, .(MEAN = mean(BIOMASS)) , .(SAMPLE, TREATMENT)]
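Because the groups have different numbers of reps, it can help to report the group size next to each mean. The following is a minimal base R sketch of one way to do that, using a small made-up data frame (`toy`) with the question's column names; the `_MEAN` and `_N` suffixes are illustrative choices, not anything from the question.

```r
# Hypothetical toy data with the same column names as the question;
# S0 has a single control row, S1 has two reps per treatment.
toy <- data.frame(
    SAMPLE    = c("S0", "S1", "S1", "S1", "S1"),
    TREATMENT = c("Ctl", "Ctl", "Ctl", "T1", "T1"),
    BIOMASS   = c(12, 10, 20, 30, 50)
)

# Mean biomass per SAMPLE x TREATMENT group
means <- aggregate(BIOMASS ~ SAMPLE + TREATMENT, data = toy, FUN = mean)

# Number of rows (reps) that went into each group
counts <- aggregate(BIOMASS ~ SAMPLE + TREATMENT, data = toy, FUN = length)

# Join the two summaries on the grouping columns; the suffixes
# turn the two BIOMASS columns into BIOMASS_MEAN and BIOMASS_N.
result <- merge(means, counts, by = c("SAMPLE", "TREATMENT"),
                suffixes = c("_MEAN", "_N"))

print(result)
```

Only the combinations that actually occur in the data appear in the result, so a sample with just a control row contributes a single line.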

Upvotes: 3
