Reputation: 4279
I need to reshape and summarize a data frame. I've identified reshape2 (and perhaps dplyr) as the most likely packages for the job, but the only approach I've come up with is so inefficient and tedious that it doesn't bear presenting here. Here's an example data set; the real one has more variables to summarize, with more aggregate functions:
> d <- data.frame(type=sample(c("dog","goat", "pika"), 50, replace=TRUE), a=rnorm(50, 50,8), b=rnorm(50,70,4))
> head(d)
type a b
1 dog 49.29015 73.09723
2 dog 35.16051 72.44976
3 dog 58.37524 66.41876
4 goat 66.05670 64.05190
5 goat 51.45586 69.84018
6 goat 63.10084 69.70595
I'm trying to get it into a shape like this:
type variable mean sd
1 dog a 50 8
2 dog b 70 4
3 goat a 50 8
4 goat b 70 4
5 pika a 50 8
6 pika b 70 4
Upvotes: 1
Views: 699
Reputation: 887881
You could use dplyr with tidyr:
library(dplyr)
library(tidyr)
d %>%
  gather(variable, val, a:b) %>%
  group_by(type, variable) %>%
  summarise(Mean=mean(val, na.rm=TRUE), Sd=sd(val, na.rm=TRUE))
This gives the result below (the numbers differ from yours because the example didn't use set.seed):
# type variable Mean Sd
#1 dog a 45.72271 7.304119
#2 dog b 72.16658 5.562985
#3 goat a 48.10097 6.856664
#4 goat b 70.16296 4.014350
#5 pika a 52.88040 6.434812
#6 pika b 68.70830 4.343295
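For what it's worth, newer tidyr releases describe gather() as superseded in favour of pivot_longer(). A minimal sketch of the same pipeline in that style follows, assuming tidyr >= 1.0.0 and dplyr >= 1.0.0 are installed. The set.seed(42) call and the names `long` and `res` are my own additions for illustration, not part of the question.

```r
library(dplyr)
library(tidyr)

# Assumption: a fixed seed so the example data is reproducible
set.seed(42)
d <- data.frame(type = sample(c("dog", "goat", "pika"), 50, replace = TRUE),
                a = rnorm(50, 50, 8),
                b = rnorm(50, 70, 4))

# Stack columns a and b into name/value pairs (one row per type/variable/value)
long <- d %>%
  pivot_longer(cols = a:b, names_to = "variable", values_to = "val")

# One row per type/variable combination, with the requested summaries
res <- long %>%
  group_by(type, variable) %>%
  summarise(Mean = mean(val, na.rm = TRUE),
            Sd = sd(val, na.rm = TRUE),
            .groups = "drop")

res
```

The `.groups = "drop"` argument only returns an ungrouped result; leaving it out should give the same numbers, still grouped by type.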
Upvotes: 4
Reputation: 206546
This uses reshape2 and dplyr:
library(reshape2)
library(dplyr)
summarize(group_by(melt(d), type, variable), mean=mean(value), sd=sd(value))
# Source: local data frame [6 x 4]
# Groups: type
# type variable mean sd
# 1 dog a 47.42249 10.669676
# 2 dog b 68.92475 3.659657
# 3 goat a 52.41433 7.181254
# 4 goat b 70.28015 3.815483
# 5 pika a 51.78442 8.513349
# 6 pika b 71.10006 4.445932
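Since the real data needs more aggregate functions, here is a rough sketch of how the same call could be extended. It assumes dplyr >= 1.0.0 (for `.groups`). The extra summaries (median and a row count), the set.seed(1) call, and the names `long` and `res2` are only illustrative choices of mine. Passing id.vars explicitly should also avoid the "Using type as id variables" message that melt(d) prints.

```r
library(reshape2)
library(dplyr)

# Assumption: a fixed seed so the example data is reproducible
set.seed(1)
d <- data.frame(type = sample(c("dog", "goat", "pika"), 50, replace = TRUE),
                a = rnorm(50, 50, 8),
                b = rnorm(50, 70, 4))

# Long format with columns type, variable, value
long <- melt(d, id.vars = "type")

# Add as many summary columns as needed, one argument per aggregate
res2 <- summarise(group_by(long, type, variable),
                  mean = mean(value),
                  sd = sd(value),
                  median = median(value),
                  n = n(),
                  .groups = "drop")

res2
```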
Upvotes: 4