KillerSnail

Reputation: 3591

summarise_at dplyr multiple columns

I am trying to apply a complex function to multiple columns after grouping the data.

Code example is:

library(dplyr)
data(iris)

add = function(x,y) {
    z = x+y
    return(mean(z))
}

iris %>%
  group_by(Species) %>%
  summarise_at(.vars=c('Sepal.Length', 'Sepal.Width'), 
           .funs =  add('Sepal.Length', 'Sepal.Width' ) )

I expected the function to be applied to each group and the result returned as a new column, but I get:

Error in x + y : non-numeric argument to binary operator

How can I get this to work?

Note: my real problem involves a much more complicated function than the simple add function I've written here. It requires the two columns to be passed in as separate arguments, so I can't just sum them first.

Thanks

Upvotes: 4

Views: 6561

Answers (2)

6 Pool

Reputation: 1

summarize() already allows you to summarize multiple columns.

Example:

df %>% summarize(mean_xvalues = mean(x), sd_yvalues = sd(y), median_zvalues = median(z))

where x, y and z are columns of the data frame df.
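For something that can be run directly, here is a minimal sketch of the same pattern on the iris data from the question. The choice of columns and the output names (mean_sl, sd_sw, median_pl) are only illustrative assumptions, not anything the question asks for.

```r
library(dplyr)

# Each argument to summarise() can use a different column and a
# different function; the result has one row per group.
res <- iris %>%
  group_by(Species) %>%
  summarise(
    mean_sl   = mean(Sepal.Length),
    sd_sw     = sd(Sepal.Width),
    median_pl = median(Petal.Length)
  )

print(res)
```

The result should have one row per species, with one column for each named summary.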

Upvotes: 0

Aramis7d

Reputation: 2496

I don't think you need summarise_at, since your definition of add already takes care of the multiple input arguments. summarise_at is useful when you are applying the same function to each of several columns, not for combining them.

If you just want the sum of each column, you can try:

iris %>%
  group_by(Species) %>%
  summarise_at(
    .vars = vars(Sepal.Length, Sepal.Width),
    .funs = sum)

which gives:

     Species Sepal.Length Sepal.Width
      <fctr>        <dbl>       <dbl>
1     setosa          250         171
2 versicolor          297         138
3  virginica          329         149

If you want to add the columns together into a single total, you can just do:

iris %>%
  group_by(Species) %>%
  summarise( k = sum(Sepal.Length, Sepal.Width))

which gives:

     Species     k
      <fctr> <dbl>
1     setosa   422
2 versicolor   435
3  virginica   478

Using this form with your definition of add:

add = function(x,y) {
  z = x+y
  return(mean(z))
}


iris %>%
  group_by(Species) %>%
  summarise( k = add(Sepal.Length, Sepal.Width))

returns:

     Species     k
      <fctr> <dbl>
1     setosa  8.43
2 versicolor  8.71
3  virginica  9.56
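As for why the original attempt errors, here is a rough sketch of my understanding. In `.funs = add('Sepal.Length', 'Sepal.Width')`, add is not handed over as a function. It is evaluated as an ordinary call on two character strings, and `+` is not defined for strings. The names res and out below are just placeholders for illustration.

```r
library(dplyr)

add <- function(x, y) {
  z <- x + y
  mean(z)
}

# Calling add() on the column *names* (character strings) reproduces
# the error from the question; tryCatch() captures its message.
res <- tryCatch(
  add("Sepal.Length", "Sepal.Width"),
  error = function(e) conditionMessage(e)
)
print(res)

# Inside summarise(), bare column names refer to the numeric vectors
# for the current group, so add() receives numbers and works.
out <- iris %>%
  group_by(Species) %>%
  summarise(k = add(Sepal.Length, Sepal.Width))
print(out)
```

So the fix is less about summarise_at versus summarise and more about passing the columns themselves, rather than their names, to your function.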

Upvotes: 3
