Reputation: 3591
I am trying to apply a complex function on multiple columns after applying a group on it.
Code example is:
library(dplyr)
data(iris)
add = function(x,y) {
z = x+y
return(mean(z))
}
iris %>%
group_by(Species) %>%
summarise_at(.vars=c('Sepal.Length', 'Sepal.Width'),
.funs = add('Sepal.Length', 'Sepal.Width' ) )
I was expecting that the function would be applied to each group and returned as a new column but I get:
Error in x + y : non-numeric argument to binary operator
How can I get this work?
Note my real problem has a much more complicated function than the simple add
function I've written here that requires the two columns be fed in as separate entities I can't just sum them first.
Thanks
Upvotes: 4
Views: 6561
Reputation: 1
summarize()
already allows you to summarize multiple columns.
example:
summarize(mean_xvalues = mean(x) , sd_yvalues = sd(y), median_zvalues = median(z))
where x,y,z are columns of a dataframe.
Upvotes: 0
Reputation: 2496
Don't think you need summarise_at
, since your definition of add
takes care fo the multiple input arguments. summarise_at
is useful when you are applying the same change to multiple columns, not for combining them.
If you just want sum of the columns, you can try:
iris %>%
group_by(Species) %>%
summarise_at(
.vars= vars( Sepal.Length, Sepal.Width),
.funs = sum)
which gives:
Species Sepal.Length Sepal.Width
<fctr> <dbl> <dbl>
1 setosa 250 171
2 versicolor 297 138
3 virginica 329 149
in case you want to add the columns together, you can just do:
iris %>%
group_by(Species) %>%
summarise( k = sum(Sepal.Length, Sepal.Width))
which gives:
Species k
<fctr> <dbl>
1 setosa 422
2 versicolor 435
3 virginica 478
using this form with your definition of add
add = function(x,y) {
z = x+y
return(mean(z))
}
iris %>%
group_by(Species) %>%
summarise( k = add(Sepal.Length, Sepal.Width))
returns
Species k
<fctr> <dbl>
1 setosa 8
2 versicolor 9
3 virginica 10
Upvotes: 3