R: Apply function to calculate mean of a single column of dataframe across a list

Question

Some sample data

I have three data frames stored in a list:

    loc <- c("A","A","A","B","B","B")
    sub.loc <- c(1,2,3,1,2,3)

    set.seed(123)

    df1 <- as.data.frame(cbind(loc,sub.loc, round(rnorm(6),digits =2)))
    df2 <- as.data.frame(cbind(loc,sub.loc, round(rnorm(6),digits =2)))
    df3 <- as.data.frame(cbind(loc,sub.loc, round(rnorm(6),digits =2)))

    list.name <- list(df1,df2,df3)

I want to produce a single file that has the mean and sd of the third column, V3, for each combination of loc and sub.loc across the three data frames.

Something like: 

    loc    sub.loc    V3                            V4
    A      1          mean(c(-0.56,0.46,0.4))       sd(c(-0.56,0.46,0.4))
    A      2          mean(c(-0.23,-1.27,0.11))     sd(c(-0.23,-1.27,0.11))
    A      3          mean(c(-0.56,-0.69,1.56))     sd(c(-0.56,-0.69,1.56))
    B      1          mean(c(0.07,-0.45,1.79))      sd(c(0.07,-0.45,1.79))
    B      2          mean(c(0.13,1.22,0.5))        sd(c(0.13,1.22,0.5))
    B      3          mean(c(1.72,0.36,-1.97))      sd(c(1.72,0.36,-1.97))

My actual data in column `V3` has NAs.

I thought of using lapply:

    lapply(list.name, function(x) mean(x, na.rm = T))

    lapply(list.name, function(x) sd(x, na.rm = T))

But both of them give me NAs.

MrFlick · Accepted Answer

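This can be done with dplyr. First, I'm not sure how closely your sample data above matches your real data, but right now all your "numeric" values are factors (or character strings in R 4.0 and later). That happens because cbind() coerces the mixed columns into a single character matrix before as.data.frame() ever sees them. You really shouldn't use cbind() inside as.data.frame(); call data.frame() on the vectors directly instead.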
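As a rough sketch of that point, assuming your real data can be built the same way as the sample, here is the same setup with data.frame() called directly. The helper `make_df` and the explicit column name `V3` are my own additions for illustration, not something from your code.

```r
# Same sample data, built with data.frame() so V3 stays numeric
loc <- c("A", "A", "A", "B", "B", "B")
sub.loc <- c(1, 2, 3, 1, 2, 3)

set.seed(123)

# make_df is a hypothetical helper, only here to avoid repeating the call
make_df <- function() {
  data.frame(loc, sub.loc, V3 = round(rnorm(6), digits = 2))
}

df1 <- make_df()
df2 <- make_df()
df3 <- make_df()

list.name <- list(df1, df2, df3)

# V3 is now numeric, so no factor/character conversion is needed later
str(df1$V3)
```

With the data built this way, the as.numeric(as.character(...)) step is no longer needed.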

But with your example data above, we can stack the data into one larger data.frame and then do a simple group_by to get the values you want:

    library(dplyr)

    # Stack the data frames; .id records which list element each row came from
    bind_rows(list.name, .id = "from") %>%
      mutate(V3 = as.numeric(as.character(V3))) %>%  # fix the factors from the sample
      group_by(loc, sub.loc) %>%
      summarize(mean = mean(V3, na.rm = TRUE), sd = sd(V3, na.rm = TRUE))
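If you would rather not depend on dplyr, the same stack-then-group idea can probably be done in base R with do.call(rbind, ...) and aggregate(). This is only a sketch: the three small data frames below are made-up stand-ins (with a numeric `V3` and one NA), not your data, and the name `stacked` is mine.

```r
# Small stand-in data with numeric V3 and one NA (hypothetical values)
df1 <- data.frame(loc = c("A", "B"), sub.loc = c(1, 1), V3 = c(1, 4))
df2 <- data.frame(loc = c("A", "B"), sub.loc = c(1, 1), V3 = c(3, NA))
df3 <- data.frame(loc = c("A", "B"), sub.loc = c(1, 1), V3 = c(5, 6))
list.name <- list(df1, df2, df3)

# Stack the list into one data frame
stacked <- do.call(rbind, list.name)

# Mean and sd per loc/sub.loc group.
# na.action = na.pass keeps the NA rows so that na.rm = TRUE does the dropping.
res <- aggregate(V3 ~ loc + sub.loc, data = stacked,
                 FUN = function(x) c(mean = mean(x, na.rm = TRUE),
                                     sd = sd(x, na.rm = TRUE)),
                 na.action = na.pass)

# aggregate() returns the two statistics as one matrix column;
# flatten it into separate V3.mean and V3.sd columns
res <- do.call(data.frame, res)
print(res)
```

The result has one row per loc/sub.loc pair, which is the same shape as the table you described.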
