lith

Reputation: 949

dplyr: Pass grouped data to a function in summarize()

This is a problem with dplyr I often stumble over. Let's consider the following code:

foo <- function (x, aux) {...}

auxcols <- c("Sepal.Width", "Petal.Width")
group_by(iris, Species) %>%
  summarize(f = foo(Sepal.Length, .[, auxcols]))

NOTE: auxcols is not known in advance.

Here aux receives the full ungrouped data. This is never what I want.
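A minimal sketch of why this happens, assuming the magrittr pipe (%>%): inside the piped call, `.` refers to the whole left-hand side, not to the current group. The column names n_dot and n_group are only illustrative.

```r
library(dplyr)

# `.` is the entire piped-in data frame, so nrow(.) is the same for every group,
# while n() counts the rows of the group currently being summarized.
res_dot <- group_by(iris, Species) %>%
  summarize(n_dot = nrow(.), n_group = n())

res_dot
```

If n_dot shows the full row count of iris for every species while n_group shows the per-group count, the `.` placeholder is indeed bypassing the grouping.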

How would I have to change the call to summarize() so that aux contains only the data from the group that is about to be summarized?

Upvotes: 0

Views: 549

Answers (2)

lith

Reputation: 949

@Onyambu provided the correct solution.

group_by(iris, Species) %>%
  summarize(f = foo(Sepal.Length, cur_data()[, auxcols]))

So easy.
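As far as I know, cur_data() is deprecated from dplyr 1.1.0 onwards in favour of pick(). A hedged sketch of the same idea with pick() follows. It assumes dplyr >= 1.1.0, and the body of foo here is a made-up placeholder, since the real foo is not shown in the question.

```r
library(dplyr)

# Hypothetical stand-in for foo: mean of x divided by the mean of all aux values.
foo <- function(x, aux) {
  mean(x) / mean(as.matrix(aux))
}

auxcols <- c("Sepal.Width", "Petal.Width")

res <- iris %>%
  group_by(Species) %>%
  summarize(
    # rows of the per-group aux data, to check that only the current group is passed
    n_aux = nrow(pick(all_of(auxcols))),
    f = foo(Sepal.Length, pick(all_of(auxcols)))
  )

res
```

all_of() is used so that auxcols is treated as a character vector of column names that is only known at run time.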

Upvotes: 1

MrGumble

Reputation: 5766

You could change foo so that it receives the two columns as two separate arguments, like this:

foo <- function(length, sepal, petal) {
  ## if you really, really need to process it as a data.frame, just re-create it:
  aux <- data.frame(Sepal.Width = sepal, Petal.Width = petal)
  ## ... then compute and return the summary value from aux here
}

group_by(iris, Species) %>% summarise(f = foo(Sepal.Length, Sepal.Width, Petal.Width))

Upvotes: 0
