Alby

Reputation: 5742

How to use dplyr in place of ddply in the following context

The following is what I am trying to do:

dput(dat)
structure(list(group = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), .Label = c("group1", "group2", 
"group3"), class = "factor"), value = c(34L, 143L, 36L, 23L, 
134L, 24L, 28L, 120L, 36L, 24L, 155L, 43L, 25L, 145L, 12L)), .Names = c("group", 
"value"), row.names = c(NA, -15L), class = "data.frame")

> dat %>% ddply(.(group), function(x){sum((x$value-mean(x$value))^2)}) %>% .[["V1"]] %>% sum()
[1] 1372.8

Basically, I compute the sum of squared deviations from the mean within each group, then sum the results. When I try to do the same with dplyr, I get the following error:

> dat %>% group_by(group) %>% do(function(x) {x$value-mean(x$value)})
Error: Results are not data frames at positions: 1, 2, 3

Upvotes: 0

Views: 249

Answers (2)

lukeA

Reputation: 54237

Maybe try

library(dplyr)
dat %>% 
  group_by(group) %>% 
  summarise(V1 =  sum((value - mean(value))^2)) %>% 
  summarise(V1 = sum(V1)) %>% 
  .$V1
# [1] 1372.8

Or, if you want to use do():

dat %>% 
  group_by(group) %>% 
  do({data.frame(V1 = sum((.$value-mean(.$value))^2))}) %>% 
  ungroup() %>% 
  summarise(V1 = sum(V1)) %>% 
  .$V1
# [1] 1372.8

Upvotes: 2

akrun

Reputation: 887048

You could use summarise, extract the "V1" column, and sum it:

dat %>% 
    group_by(group) %>% 
    dplyr::summarise(V1=sum((value-mean(value))^2))%>%
    .$V1 %>% 
    sum()
#[1] 1372.8
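
As a cross-check that does not depend on plyr or dplyr, the same quantity can be sketched in base R. This is only an illustration: the data frame is rebuilt with data.frame() instead of the dput() output from the question, and the names ss_by_group and total_ss are made up for this example.

```r
# Same data as in the question, rebuilt with data.frame() for brevity.
dat <- data.frame(
  group = factor(rep(c("group1", "group2", "group3"), times = 5)),
  value = c(34L, 143L, 36L, 23L, 134L, 24L, 28L, 120L,
            36L, 24L, 155L, 43L, 25L, 145L, 12L)
)

# Per-group sum of squared deviations from the group mean
# (ss_by_group is a hypothetical name used only here).
ss_by_group <- tapply(dat$value, dat$group, function(v) sum((v - mean(v))^2))

# Total within-group sum of squares.
total_ss <- sum(ss_by_group)
print(total_ss)
# [1] 1372.8
```

The per-group values (78.8, 693.2 and 600.8) add up to the 1372.8 reported by the ddply and dplyr versions.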

Upvotes: 2
