Reputation: 330
Let's say I have two data frames, df1 and df2:
df1 = data.frame(Name = c('one','two', 'three','four'),
x=c(4,6,2,9),
y=c(45,78,44,8),
z=c(56,52,45,88))
df2 = data.frame(Name = c('one','two', 'three','four'),
x=c(5,8,3,3),
y=c(34,53,50,9),
z=c(38,96,62,83))
I am trying to produce a data frame, dfout, that contains the mean and standard deviation of each column for each Name, i.e.:
Name = 'one'...
xmean = 4.5...
xsd = 0.5...
.
.
.
I tried using bind_rows() and then aggregate() on the combined data frame, but ran into some trouble. I suspect there is an easier way to do this, similar to what was suggested by @thelatemail.
Any input is appreciated.
Upvotes: 0
Views: 110
Reputation: 41553
You can use bind_rows to stack your two data frames row-wise, then use aggregate to compute the mean and sd per Name over multiple columns. You can use the following code:
library(dplyr)
aggregate(. ~ Name, bind_rows(df1, df2), function(x) c(mean = mean(x), sd = sd(x)))
Output:
Name x.mean x.sd y.mean y.sd z.mean z.sd
1 four 6.000 4.243 8.500 0.707 85.50 3.54
2 one 4.500 0.707 39.500 7.778 47.00 12.73
3 three 2.500 0.707 47.000 4.243 53.50 12.02
4 two 7.000 1.414 65.500 17.678 74.00 31.11
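One thing to be aware of: when the function passed to aggregate returns a vector, each of x, y and z comes back as a two-column matrix rather than as separate columns, even though it prints as x.mean, x.sd and so on. If you need ordinary columns, a common idiom is do.call(data.frame, ...). The sketch below uses base rbind in place of bind_rows (equivalent here, since both data frames have the same columns) so that it runs without dplyr; the names agg and flat are just illustrative.

```r
df1 <- data.frame(Name = c("one", "two", "three", "four"),
                  x = c(4, 6, 2, 9),
                  y = c(45, 78, 44, 8),
                  z = c(56, 52, 45, 88))
df2 <- data.frame(Name = c("one", "two", "three", "four"),
                  x = c(5, 8, 3, 3),
                  y = c(34, 53, 50, 9),
                  z = c(38, 96, 62, 83))

# Same aggregation as above, on the row-bound data
agg <- aggregate(. ~ Name, rbind(df1, df2),
                 function(v) c(mean = mean(v), sd = sd(v)))

# agg has only 4 columns: Name plus three matrix columns (x, y, z)
ncol(agg)

# Expand each matrix column into separate x.mean, x.sd, ... columns
flat <- do.call(data.frame, agg)
names(flat)
```

After this step, flat$x.mean and flat$x.sd can be accessed like any other column.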
To restore the original order of Name:
df3 <- aggregate(. ~ Name, bind_rows(df1, df2), function(x) c(mean = mean(x), sd = sd(x)))
reorder <- c("one", "two", "three", "four")
df3[match(reorder, df3$Name),]
Output:
Name x.mean x.sd y.mean y.sd z.mean z.sd
2 one 4.500 0.707 39.500 7.778 47.00 12.73
4 two 7.000 1.414 65.500 17.678 74.00 31.11
3 three 2.500 0.707 47.000 4.243 53.50 12.02
1 four 6.000 4.243 8.500 0.707 85.50 3.54
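Note that the question expects xsd = 0.5 for Name 'one', while the output above shows 0.707. That difference is because sd() in R computes the sample standard deviation (dividing by n - 1), whereas 0.5 is the population standard deviation (dividing by n). If the population version is what you want, here is a minimal sketch; pop_sd is a hypothetical helper, not a base R function.

```r
# Hypothetical helper: population standard deviation (divides by n)
pop_sd <- function(v) sqrt(mean((v - mean(v))^2))

vals <- c(4, 5)   # the two x values for Name 'one'

sd(vals)          # sample SD, divides by n - 1
pop_sd(vals)      # population SD, divides by n
```

You could then pass pop_sd instead of sd inside the function given to aggregate.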
Upvotes: 3