jack kelly

Reputation: 330

Mean and standard deviation across elements of multiple data frames

Let's say I have two data frames df1 and df2:

df1 = data.frame(Name = c('one','two', 'three','four'), 
                 x=c(4,6,2,9),
                 y=c(45,78,44,8),
                 z=c(56,52,45,88))

df2 = data.frame(Name = c('one','two', 'three','four'), 
                 x=c(5,8,3,3),
                 y=c(34,53,50,9),
                 z=c(38,96,62,83))

I am trying to produce a data frame dfout that holds the mean and standard deviation of each column for each Name, i.e.:

Name = 'one'...
xmean = 4.5...
xsd = 0.5...
.
.
.

I tried combining the data frames with bind_rows() and then calling aggregate() on the result, but ran into some trouble. I suspect there is an easier way to do this, similar to what @thelatemail suggested.

Any input is appreciated.

Upvotes: 0

Views: 110

Answers (1)

Quinten

Reputation: 41553

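You can use bind_rows to stack your two data frames row-wise, then use aggregate to compute the mean and sd per Name across all the other columns. Note that sd() returns the sample standard deviation (denominator n - 1), so the sd of x for 'one' comes out as 0.707 rather than the 0.5 shown in the question. You can use the following code: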

library(dplyr)
aggregate(. ~ Name, bind_rows(df1, df2), function(x) c(mean = mean(x), sd = sd(x)))

Output:

   Name x.mean  x.sd y.mean   y.sd z.mean  z.sd
1  four  6.000 4.243  8.500  0.707  85.50  3.54
2   one  4.500 0.707 39.500  7.778  47.00 12.73
3 three  2.500 0.707 47.000  4.243  53.50 12.02
4   two  7.000 1.414 65.500 17.678  74.00 31.11
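
One detail that may trip you up afterwards: because the function passed to aggregate returns a vector, x, y and z in the result are each stored as a two-column matrix, so something like res$x.mean will not work directly. Below is a minimal sketch of one way to flatten the result into ordinary columns with do.call(data.frame, ...). The names res and flat are just illustrative, and base rbind is used here in place of bind_rows, which should be equivalent when both data frames have the same columns.

```r
df1 <- data.frame(Name = c('one', 'two', 'three', 'four'),
                  x = c(4, 6, 2, 9),
                  y = c(45, 78, 44, 8),
                  z = c(56, 52, 45, 88))

df2 <- data.frame(Name = c('one', 'two', 'three', 'four'),
                  x = c(5, 8, 3, 3),
                  y = c(34, 53, 50, 9),
                  z = c(38, 96, 62, 83))

# Same aggregate call as above, on the row-wise combination of df1 and df2
res <- aggregate(. ~ Name, rbind(df1, df2),
                 function(x) c(mean = mean(x), sd = sd(x)))

names(res)   # "Name" "x" "y" "z": x, y and z are matrices with columns mean and sd

# Expand each matrix column into separate numeric columns
flat <- do.call(data.frame, res)

names(flat)  # "Name" "x.mean" "x.sd" "y.mean" "y.sd" "z.mean" "z.sd"
```

After this step, columns such as flat$x.mean can be used like any other numeric column.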

Edit

To restore the original order of Name, index the result with match():

df3 <- aggregate(. ~ Name, bind_rows(df1, df2), function(x) c(mean = mean(x), sd = sd(x)))
reorder <- c("one", "two", "three", "four")
df3[match(reorder, df3$Name),]

Output:

   Name x.mean  x.sd y.mean   y.sd z.mean  z.sd
2   one  4.500 0.707 39.500  7.778  47.00 12.73
4   two  7.000 1.414 65.500 17.678  74.00 31.11
3 three  2.500 0.707 47.000  4.243  53.50 12.02
1  four  6.000 4.243  8.500  0.707  85.50  3.54
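
If you would rather stay within dplyr, the same result can probably be reached without aggregate. This is only a sketch, assuming dplyr 1.0.0 or later (for across()). Turning Name into a factor whose levels follow first appearance is my own choice for keeping the original order, and dfout is the name used in the question.

```r
library(dplyr)

df1 <- data.frame(Name = c('one', 'two', 'three', 'four'),
                  x = c(4, 6, 2, 9),
                  y = c(45, 78, 44, 8),
                  z = c(56, 52, 45, 88))

df2 <- data.frame(Name = c('one', 'two', 'three', 'four'),
                  x = c(5, 8, 3, 3),
                  y = c(34, 53, 50, 9),
                  z = c(38, 96, 62, 83))

dfout <- bind_rows(df1, df2) %>%
  # Factor levels in order of first appearance, so groups are not sorted alphabetically
  mutate(Name = factor(Name, levels = unique(Name))) %>%
  group_by(Name) %>%
  # One mean and one sd column per variable, named x_mean, x_sd, ...
  summarise(across(c(x, y, z), list(mean = mean, sd = sd)), .groups = "drop")

dfout
```

Here the summary columns are plain numeric columns named with an underscore (x_mean, x_sd, and so on), and the rows come out in the order one, two, three, four without a separate reordering step.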

Upvotes: 3
