Reputation: 13
I have a question about dplyr. Given the data frame my_data:
library(dplyr)
set.seed(20160229)
my_data = data.frame(
  y = c(rnorm(1000), rnorm(1000, 0.5), rnorm(1000, 1), rnorm(1000, 1.5)),
  x = c(rep('a', 2000), rep('b', 2000)),
  m = c(rep('i', 1000), rep('j', 2000), rep('i', 1000)))
case 1:
pdat <- my_data %>%
  group_by(x, m) %>%
  do(data.frame(loc = density(.$y)$x,
                dens = density(.$y)$y))
and case 2:
pdat <- my_data
pdat <- group_by(my_data, x, m)
do(data.frame(pdat,loc=density(pdat$y)$x),dens=density(pdat$y)$y)
Why are these statements different? How can case 2 be changed to match case 1?
Upvotes: 0
Views: 78
Reputation: 436
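Your call to do is missing the .data argument. You need to either pipe it in, as in your "case 1", or provide it explicitly; otherwise the first positional argument, here the data.frame(...) call, is taken as .data. Inside do, refer to the current group as . rather than pdat, because pdat$y is the whole column regardless of grouping. Try something like: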
do(.data = pdat, data.frame(loc = density(.$y)$x, dens = density(.$y)$y))
And now they match:
a <- my_data %>%
  group_by(x, m) %>%
  do(data.frame(loc = density(.$y)$x,
                dens = density(.$y)$y))
b <- do(.data = pdat, data.frame(loc = density(.$y)$x, dens = density(.$y)$y))
identical(a, b) # TRUE
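To see why the pronoun matters, here is a minimal sketch that rebuilds the question's data and compares density(pdat$y) with the per-group call. The names whole and per_group are only illustrative, and it relies on density() using its default grid size of n = 512:

```r
library(dplyr)

set.seed(20160229)
my_data <- data.frame(
  y = c(rnorm(1000), rnorm(1000, 0.5), rnorm(1000, 1), rnorm(1000, 1.5)),
  x = c(rep('a', 2000), rep('b', 2000)),
  m = c(rep('i', 1000), rep('j', 2000), rep('i', 1000)))
pdat <- group_by(my_data, x, m)

# pdat$y ignores the grouping: a single density over all 4000 values
whole <- density(pdat$y)
length(whole$x)  # size of density()'s default evaluation grid

# Inside do(), `.` is the current group's rows, so density() runs once per group
per_group <- do(.data = pdat,
                data.frame(loc = density(.$y)$x, dens = density(.$y)$y))
n_groups(pdat)   # number of (x, m) combinations
nrow(per_group)  # one evaluation grid per group, stacked
```

The result also keeps the grouping columns x and m, which is what lets you plot one curve per group afterwards.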
Upvotes: 1