Nicholas Root

Reputation: 555

Getting the tidyr::nest() -> purrr::map() workflow to work for the special case of no grouping variable

I'm trying to write a function that does split-apply-combine, where the split variable(s) are passed as parameters and, importantly, a null split is acceptable. For example, it should run statistics either on subsets of the data or on the entire dataset.

library(dplyr)
library(tidyr)
library(purrr)

somedata=expand.grid(a=1:3,b=1:3)

somefun=function(df_in,grpvars=NULL){

  df_in %>% group_by_(.dots=grpvars) %>% nest() %>%
    mutate(X2.Resid=map(data,~with(.x,chisq.test(b)$residuals))) %>%
    unnest(data,X2.Resid) %>% return()

}

somefun(somedata,"a") # This works
somefun(somedata) # This fails

The null condition fails because nest() seems to need a variable to nest by, rather than nesting the entire df into a 1x1 data.frame. I can get around this as follows:

somefun2=function(df_in,grpvars="Dummy"){

  df_in$Dummy=1
  df_in %>% group_by_(.dots=grpvars) %>% nest() %>%
    mutate(X2.Resid=map(data,~with(.x,chisq.test(b)$residuals))) %>%
    unnest(data,X2.Resid) %>%
    select(-Dummy) %>% return()

}

somefun2(somedata) # This works

However, I'm wondering if there is a more elegant way to fix this, without needing the dummy variable?

Upvotes: 1

Views: 293

Answers (1)

Axeman

Reputation: 35187

Hmm, that behavior is a little surprising to me. A fix is easy though: you just have to make sure you nest everything():

somefun3 <- function(df_in, grpvars = NULL) {
  df_in %>% 
    group_by_(.dots = grpvars) %>% 
    nest(everything()) %>% 
    mutate(X2.Resid = map(data, ~with(.x, chisq.test(b)$residuals))) %>%
    unnest()
}
somefun3(somedata, "a")
somefun3(somedata) 

Both work.
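
If stepping outside nest() is acceptable, the null split can also be handled by treating the whole data frame as a single piece. The following is only a sketch of that idea, not the approach from the answer above: it swaps nest()/unnest() for base split() plus bind_rows(), and the name somefun4 is hypothetical.

```r
library(dplyr)
library(purrr)

# somefun4 is a hypothetical name used for illustration only.
somefun4 <- function(df_in, grpvars = NULL) {
  # No grouping variables: the entire data frame is the only piece.
  # Otherwise split on the interaction of the grouping columns.
  pieces <- if (is.null(grpvars)) {
    list(df_in)
  } else {
    split(df_in, df_in[grpvars], drop = TRUE)
  }

  pieces %>%
    # Apply: add the chi-squared residuals of column b within each piece.
    map(~ mutate(.x, X2.Resid = as.numeric(chisq.test(b)$residuals))) %>%
    # Combine: stack the pieces back into one data frame.
    bind_rows()
}

somedata <- expand.grid(a = 1:3, b = 1:3)
somefun4(somedata, "a")  # split by a
somefun4(somedata)       # no split
```

With such small counts, chisq.test() is likely to warn that the chi-squared approximation may be incorrect; that is a property of the toy data rather than of the splitting logic.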

Upvotes: 4
