Josh

Reputation: 105

group and average replicates from a data frame

I have a data frame of different samples, each with technical replicates (AA.1, AA.2, AA.3). Each measurement is stored in var3, and the full sample set (all samples and their technical replicates) is repeated for each value of var2 (X, Y, or Z). So in total, I have (number of samples) x (number of technical replicates) x (number of var2 values) combinations, with every var1 x var2 combination measured 3 times.

data.frame(
  var1=rep(rep(c('AA.1', 'AA.2', 'AA.3', 'BB.1', 'BB.2', 'BB.3'), each=3), 2),
  var2=rep(c('X', 'Y'), each=18),
  var3=sample(20:40, 36, replace=TRUE)
)

For each var2, I want to average each individual sample's technical replicates. I would like to do this by creating a new data frame with the sample names as the row names and the 3 technical replicates as 3 columns, so that I can then use rowMeans() and sd(). How can I do this?

Upvotes: 1

Views: 11332

Answers (2)

jlhoward

Reputation: 59375

In base R (calling your data frame df):

aggregate(var3~var1+var2,df,mean)
#    var1 var2     var3
# 1  AA.1    X 31.66667
# 2  AA.2    X 25.00000
# 3  AA.3    X 30.66667
# 4  BB.1    X 27.33333
# 5  BB.2    X 32.00000
# 6  BB.3    X 29.66667
# 7  AA.1    Y 32.33333
# 8  AA.2    Y 24.66667
# 9  AA.3    Y 26.66667
# 10 BB.1    Y 38.00000
# 11 BB.2    Y 30.33333
# 12 BB.3    Y 25.66667
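
Since the question also asks for the standard deviation, here is a minimal sketch of getting both statistics in a single aggregate() call. It assumes the data frame is named df as above. The set.seed() value and the result name stats are arbitrary choices for illustration and are not from the original post.

```r
# Rebuild the example data from the question (the seed is an arbitrary choice)
set.seed(1)
df <- data.frame(
  var1 = rep(rep(c('AA.1', 'AA.2', 'AA.3', 'BB.1', 'BB.2', 'BB.3'), each = 3), 2),
  var2 = rep(c('X', 'Y'), each = 18),
  var3 = sample(20:40, 36, replace = TRUE)
)

# Mean and standard deviation of var3 for each var1/var2 combination
stats <- aggregate(var3 ~ var1 + var2, df,
                   function(x) c(mean = mean(x), sd = sd(x)))

# aggregate() returns the two statistics as one matrix column;
# do.call(data.frame, ...) flattens it into separate columns
stats <- do.call(data.frame, stats)
stats
```

After flattening, the two statistics should appear as separate columns next to var1 and var2, which makes them easier to filter or plot.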

Upvotes: 7

Mark T Patterson

Reputation: 417

There are several ways of doing this. I think using dplyr is probably the most straightforward, but you could also use the tapply function. It's a little hard for me to figure out from your question which variables you want to group by, but hopefully running the following code will help make things clear.

Assuming you want to find the mean of var3, grouped by both var1 and var2, enter the following:

library(dplyr)

# dat is the data frame from the question
dat %>%
  group_by(var2, var1) %>%
  summarize(var3.mean = mean(var3))

As I said, it's a bit hard for me to tell whether this is the grouping structure you want. The code above will give you the mean of var3 for each unique combination of var1 and var2.
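
For the tapply route mentioned at the top, a rough sketch might look like the following. It assumes the data frame is called dat and rebuilds it from the question's code. The seed and the names means and sds are arbitrary choices for illustration.

```r
# Rebuild the example data from the question (the seed is an arbitrary choice)
set.seed(1)
dat <- data.frame(
  var1 = rep(rep(c('AA.1', 'AA.2', 'AA.3', 'BB.1', 'BB.2', 'BB.3'), each = 3), 2),
  var2 = rep(c('X', 'Y'), each = 18),
  var3 = sample(20:40, 36, replace = TRUE)
)

# Rows are the var1 levels, columns are the var2 levels,
# and each cell holds the mean (or sd) of var3 for that combination
means <- with(dat, tapply(var3, list(var1, var2), mean))
sds   <- with(dat, tapply(var3, list(var1, var2), sd))

means
sds
```

Unlike the dplyr version, this returns a matrix rather than a data frame, which can be handy if you want var2 spread across columns.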

Upvotes: 2
