michaela stubbers

Reputation: 61

R: expss package: Error in do.call(data.frame, c(x, alis)) : variable names are limited to 10000 bytes

I defined some table functions in R using the expss package to automate tabulation. One of my tables shows cases or percentages for the categories, followed by the mean. The mean can be based on the same category variable or on a different variable. Overall the code works perfectly. For some variables, though, I keep running into the error "Error in do.call(data.frame, c(x, alis)) : variable names are limited to 10000 bytes".

The code for this table:

  Table2 = function (Q, banner=banner, caption , Q.mean, ddata=d, questlab=dquest, mis.val=999) {
  x_totaln<-eval(substitute(x),ddata)
  x_totaln[is.na(eval(substitute(Q),ddata))]<-NA
  if(missing(Q.mean))
  {Q_mean<-eval(substitute(Q),ddata)}
  else 
  {Q_mean<-eval(substitute(Q.mean),ddata)}
  Q_mean[Q_mean==mis.val]<-NA
  if(missing(caption))
  {caption<-eval(substitute(var_lab(Q_mean)),questlab)}
  eval.parent(substitute(
    { 
      banner %>%
        tab_cells (x_totaln) %>%
        tab_stat_cases(total_row_position = c("none"),label = 'N') %>%
        tab_cells (Q) %>%
        tab_stat_cases(total_row_position = c("none"),label = 'N') %>%
        tab_stat_cpct(total_row_position = c("none"), label = '%') %>%
        tab_cells (Q_mean) %>%
        tab_stat_mean(label = 'Mean') %>%
        tab_pivot (stat_position = "inside_rows") %>%  
        drop_c ()  %>%
        custom_format2()  %>%
        set_caption(caption)
    }
  ))
}

Overall this code works perfectly, for example:

Table2(Q8_cat)

For some variables, though, it generates the error:

Table2(age_cat,Q.mean=age,caption="Your age at the start of the programme?")
 Error in do.call(data.frame, c(x, alis)) : 
  variable names are limited to 10000 bytes 
19. do.call(data.frame, c(x, alis))

If I hard-code the variables in the function body, it works perfectly again:

Table2test = function () {
  x_totaln<-eval(substitute(x),d)
  x_totaln[is.na(eval(substitute(age_cat),d))]<-NA
  Q_mean<-eval(substitute(age),d)
  Q_mean[Q_mean==999]<-NA
      banner %>%
        tab_cells (x_totaln) %>%
        tab_stat_cases(total_row_position = c("none"),label = 'N') %>%
        tab_cells (age_cat) %>%
        tab_stat_cases(total_row_position = c("none"),label = 'N') %>%
        tab_stat_cpct(total_row_position = c("none"), label = '%') %>%
        tab_cells (Q_mean) %>%
        tab_stat_mean(label = 'Mean') %>%
        tab_pivot (stat_position = "inside_rows") %>%  
        drop_c ()  %>%
        custom_format2()  %>%
        set_caption("Your age at the start of the programme?")
  
}

Any advice, or does anyone have an idea why the error occurs?

Thanks

Upvotes: 1

Views: 578

Answers (1)

Gregory Demin

Reputation: 4846

When you substitute variables, in some cases they are inlined into the expression as a structure. The expression then contains no variable name, only the value: tab_cells(structure(c(22, 23, 22, 23, ... many numbers))). expss tries to use this long representation as a name in the resulting table, but R limits the length of names, so the function fails at that point. The solution is quite simple: always set variable labels, which are then used as the names. The following code runs without any errors:

Table2 = function (Q, banner=banner, caption , Q.mean, ddata=d, questlab=dquest, mis.val=999) {
    x_totaln<-eval(substitute(x),ddata)
    x_totaln[is.na(eval(substitute(Q),ddata))]<-NA
    var_lab(x_totaln) = "Total" # add label for total
    if(missing(Q.mean))
    {Q_mean<-eval(substitute(Q),ddata)}
    else 
    {Q_mean<-eval(substitute(Q.mean),ddata)}
    Q_mean[Q_mean==mis.val]<-NA
    if(missing(caption))
    {caption<-eval(substitute(var_lab(Q_mean)),questlab)}
    eval.parent(substitute(
        { 
            banner %>%
                tab_cells (x_totaln) %>%
                tab_stat_cases(total_row_position = c("none"),label = 'N') %>%
                tab_cells (Q) %>%
                tab_stat_cases(total_row_position = c("none"),label = 'N') %>%
                tab_stat_cpct(total_row_position = c("none"), label = '%') %>%
                tab_cells ("|" = Q_mean) %>% # "|" suppress label for mean
                tab_stat_mean(label = 'Mean') %>%
                tab_pivot (stat_position = "inside_rows") %>%  
                drop_c ()  %>%
                custom_format2()  %>%
                set_caption(caption)
        }
    ))
}
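
To see the mechanism in isolation, here is a minimal base R sketch. It does not use expss: `name_of()` is a hypothetical stand-in for any function that names its output after the expression it receives unless a label is set, and the `"label"` attribute is used here on the assumption that it mirrors what `var_lab()` stores. The wrappers and the sample data are likewise made up for illustration.

```r
# Hypothetical stand-in: prefer a "label" attribute, otherwise fall back to
# the deparsed expression that was passed in.
name_of <- function(x) {
  lab <- attr(x, "label", exact = TRUE)
  if (!is.null(lab)) return(lab)
  paste(deparse(substitute(x)), collapse = "")
}

age <- rep(c(22, 23), 2000)  # made-up data, 4000 values

# Called directly, the expression is just the symbol.
name_of(age)  # "age"

# Inside a function, substitute() replaces local variables with their values,
# so the call handed to eval.parent() contains the data itself, not a name.
wrapper <- function() {
  Q_mean <- age
  eval.parent(substitute(name_of(Q_mean)))
}
long_name <- wrapper()
substr(long_name, 1, 20)  # starts with the deparsed vector
nchar(long_name)          # far more than 10000 characters

# Turning such a string into a symbol is where the byte limit on names
# should be hit; tryCatch() keeps the sketch running either way.
tryCatch(as.name(long_name), error = function(e) conditionMessage(e))

# With a label attached, the long deparsed value is never needed as a name.
labelled_wrapper <- function() {
  Q_mean <- age
  attr(Q_mean, "label") <- "Age"
  eval.parent(substitute(name_of(Q_mean)))
}
labelled_wrapper()  # "Age"
```

This is why labelling `x_totaln` and passing `"|" = Q_mean` in the function above avoids the error: a short explicit name is available, so the inlined value is not deparsed into one.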

Upvotes: 1
