Reputation: 679
I'm trying to compute a mean on my data but I'm struggling with 2 things: 1. getting the right layout and 2. including the missing values in the outcome.
#My input data:
Stock <- c("A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B")
Soil <- c("Blank", "Blank", "Control", "Control", "Clay", "Clay", "Blank", "Blank", "Control", "Control", "Clay", "Clay")
Nitrogen <- c(NA, NA, 0, 0, 20, 20, NA, NA, 0, 0, 20, 20)
Respiration <- c(112, 113, 124, 126, 139, 137, 109, 111, 122, 124, 134, 136)
d <- as.data.frame(cbind(Stock, Soil, Nitrogen, Respiration))
#The outcome I'd like to get:
Stockr <- c("A", "A", "A", "B", "B", "B")
Soilr <- c("Blank", "Control", "Clay", "Blank", "Control", "Clay")
Nitrogenr <- c(NA, 0, 20, NA, 0, 20)
Respirationr <- c(112.5, 125, 138, 110, 123, 135)
result <- as.data.frame(cbind(Stockr, Soilr, Nitrogenr, Respirationr))
Many thanks in advance for your help!
Upvotes: 3
Views: 3473
Reputation: 2526
One more option: you can use data.table:
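library(data.table)
d1 <- as.data.table(d)
# cbind() stored every column as text, so convert the measurements back to numbers
num_cols <- c("Nitrogen", "Respiration")
d1[, (num_cols) := lapply(.SD, function(x) as.numeric(as.character(x))), .SDcols = num_cols]
# group means per Stock/Soil combination; an all-NA group yields NaN with na.rm = TRUE
d1[, .(AVG_Nitro = mean(Nitrogen, na.rm = TRUE), AVG_Resp = mean(Respiration, na.rm = TRUE)), by = .(Stock, Soil)]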
   Stock    Soil AVG_Nitro AVG_Resp
1:     A   Blank       NaN    112.5
2:     A Control         0    125.0
3:     A    Clay        20    138.0
4:     B   Blank       NaN    110.0
5:     B Control         0    123.0
6:     B    Clay        20    135.0
Upvotes: 0
Reputation: 81693
Here's a solution with ddply from the plyr package:
library(plyr)
ddply(d, .(Stock, Soil, Nitrogen), summarise,
Respiration = mean(as.numeric(as.character(Respiration))))
#   Stock    Soil Nitrogen Respiration
# 1     A   Blank     <NA>       112.5
# 2     A    Clay       20       138.0
# 3     A Control        0       125.0
# 4     B   Blank     <NA>       110.0
# 5     B    Clay       20       135.0
# 6     B Control        0       123.0
Please note that cbind is not a good way to create a data frame: it first builds a character matrix, so the numbers are converted to text. You should use data.frame(Stock, Soil, Nitrogen, Respiration) instead. Due to your approach, all columns of d are factors (or character vectors in R 4.0.0 and later). I used as.numeric(as.character(Respiration)) to obtain the numeric values of this column.
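The difference between the two constructions can be checked directly. The following is a minimal sketch using the vectors from the question; the names d_bad and d_good are illustrative only and do not appear in the original post.

```r
# Vectors from the question
Stock <- c("A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B")
Soil <- c("Blank", "Blank", "Control", "Control", "Clay", "Clay",
          "Blank", "Blank", "Control", "Control", "Clay", "Clay")
Nitrogen <- c(NA, NA, 0, 0, 20, 20, NA, NA, 0, 0, 20, 20)
Respiration <- c(112, 113, 124, 126, 139, 137, 109, 111, 122, 124, 134, 136)

# cbind() on mixed types builds a character matrix first,
# so the numeric columns end up as text (factor or character)
d_bad <- as.data.frame(cbind(Stock, Soil, Nitrogen, Respiration))

# data.frame() keeps each column's own type
d_good <- data.frame(Stock, Soil, Nitrogen, Respiration)

# Compare the column classes of both versions
print(sapply(d_bad, class))
print(sapply(d_good, class))
```

With d_good, mean(Respiration) can be used directly, without the as.numeric(as.character(...)) conversion.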
Upvotes: 1
Reputation: 25736
You could use a combination of aggregate and merge:
d <- data.frame(Stock=Stock, Soil=Soil,
Nitrogen=Nitrogen, Respiration=Respiration)
## aggregate values; don't remove NAs (na.action=NULL)
nitrogen <- aggregate(Nitrogen ~ Stock + Soil, data=d, FUN=mean, na.action=NULL)
respiration <- aggregate(Respiration ~ Stock + Soil, data=d, FUN=mean)
## merge results
merge(nitrogen, respiration)
#  Stock    Soil Nitrogen Respiration
#1     A   Blank       NA       112.5
#2     A    Clay       20       138.0
#3     A Control        0       125.0
#4     B   Blank       NA       110.0
#5     B    Clay       20       135.0
#6     B Control        0       123.0
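As a possible variation (not part of the answer above), both columns could probably be aggregated in a single call by putting cbind(Nitrogen, Respiration) on the left-hand side of the formula. This sketch assumes na.action = na.pass is used so that rows with a missing Nitrogen value are kept; the name res is illustrative.

```r
# Same data as in the question, built with data.frame() so types are preserved
d <- data.frame(
  Stock = c("A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B"),
  Soil = c("Blank", "Blank", "Control", "Control", "Clay", "Clay",
           "Blank", "Blank", "Control", "Control", "Clay", "Clay"),
  Nitrogen = c(NA, NA, 0, 0, 20, 20, NA, NA, 0, 0, 20, 20),
  Respiration = c(112, 113, 124, 126, 139, 137, 109, 111, 122, 124, 134, 136)
)

# Aggregate both measurement columns at once;
# na.pass keeps the rows where Nitrogen is NA instead of dropping them
res <- aggregate(cbind(Nitrogen, Respiration) ~ Stock + Soil,
                 data = d, FUN = mean, na.action = na.pass)
print(res)
```

This avoids the separate merge step, at the cost of applying the same NA handling to both columns.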
Upvotes: 1