Reputation: 2443
I have a data frame, df, on which I would like to run the function kepdf (from the package pdfCluster), which estimates a multivariate density. The point is that this is not a simple base function like head or mean.
My data frame looks like this:
> head(df)
# A tibble: 6 x 4
A B C Group
<dbl> <dbl> <dbl> <dbl>
2 1 39 1
2 2 66 1
2 2 36 1
1 1 56 1
1 1 37 1
1 1 45 1
Now, I would like to calculate the density of columns A, B, and C for each Group separately (the variable Group just indicates which group the observation belongs to and should not enter the density calculation). I naively tried the following:
df %>% group_by(Group) %>% select(1:3) %>% do(kepdf(.))
and got the following error:
Adding missing grouping variables: `Group`
Error in kepdf(.) : NA/NaN/Inf in foreign function call (arg 2)
Now, there are no missing values in the data, so I'm confused. Also, I don't want to keep the grouping variable Group in the selection, because then it would be included in the density calculation, which I don't want.
Any thoughts?
Upvotes: 0
Views: 167
Reputation: 13581
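The problem is that you group your data frame by Group and then try to drop the grouping column before calling kepdf(...). select(...) on a grouped data frame always keeps the grouping column, which is what the message "Adding missing grouping variables: `Group`" is telling you, so do(...) still passes Group to kepdf(...). Since Group is constant within each group, that is most likely what triggers the NA/NaN/Inf error.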
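To see where the Group column comes back in, here is a small check on a toy data frame. The data frame toy and its values are made up for illustration; only dplyr is assumed to be installed.

```r
library(dplyr)

# Hypothetical toy data with two groups (not the asker's real data)
toy <- data.frame(
  A = c(2, 2, 1, 1),
  B = c(1, 2, 1, 1),
  C = c(39, 66, 56, 37),
  Group = c(1, 1, 2, 2)
)

# On a grouped data frame, select() keeps the grouping column
# even though only columns 1:3 were requested
kept <- toy %>% group_by(Group) %>% select(1:3) %>% names()

# Without grouping, only the requested columns remain
ungrouped <- toy %>% select(1:3) %>% names()

"Group" %in% kept       # Group is still there
"Group" %in% ungrouped  # Group has been dropped
```

So whatever function runs after group_by(...) and select(...) still sees Group unless the data is ungrouped or split first.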
Try instead
library(dplyr)
library(purrr)
library(pdfCluster)
df %>% split(.$Group) %>% map(~ select(.x, 1:3)) %>% map(kepdf)  # one kepdf fit per group
You can always combine the last two map(...) calls into a single function:
myfun <- function(df) {
  # Drop the grouping column, then estimate the density on A, B and C
  data <- dplyr::select(df, 1:3)
  pdfCluster::kepdf(data)
}
df %>% split(.$Group) %>% map(myfun)
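If you would rather stay inside dplyr, group_map() may be an alternative. This is only a sketch: it assumes a dplyr version that provides group_map() (0.8.0 or later), and the data frame below is made up, with a hypothetical second group. group_map() hands each group's rows to the function as .x without the grouping column, which the stand-in call to names() makes visible.

```r
library(dplyr)

# Hypothetical data: same columns as in the question, two groups
df <- data.frame(
  A = c(2, 2, 2, 1, 1, 1),
  B = c(1, 2, 2, 1, 1, 1),
  C = c(39, 66, 36, 56, 37, 45),
  Group = c(1, 1, 1, 2, 2, 2)
)

# .x holds the rows of one group, with Group already removed
# (group_map() defaults to .keep = FALSE)
cols_seen <- df %>% group_by(Group) %>% group_map(~ names(.x))

# With pdfCluster installed, the same pattern would be:
# dens <- df %>% group_by(Group) %>% group_map(~ pdfCluster::kepdf(.x))
```

The result is a plain list with one element per group, in the order of the group keys, much like the output of split(...) followed by map(...).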
Upvotes: 1