Mark
Mark

Reputation: 2889

Different functions over a list of columns and generate new column names automatically with data.table

I have a section in my Shiny app that generates a list.

names of the list are column names of the dataframe we will calculate on, list items contain the calculations we want

Looking to do this:
apply to all list names:
for listname (column) x calculate function n,m,o over df column x
and name the resulting column 'x.n' i.e. 'cyl.mean', 'mpg.sum'
to get a dataframe of summary statistics PER GROUP (mtcars$cyl) in this case as example

It is linked to this question, but there the example data used a separate list of column names, and apply the same functions to all those columns from another list. I'm looking to move forward to apply unique sets of functions to different columns

The list my app spits out looks like this:

mylist


$disp
[1] "sum"  "mean"

$hp
[1] "sd"

$drat
[1] "sum"  "mean"

$wt
[1] "max"

expected output:

cyl    disp.sum  hp.sd  drat.sum  drat.mean wt.max    
4        x ....  
6        x ....  
8        x  ....  

The little Shiny app to create the list:

library(shiny)
library(data.table)
library(shinyjs)

Channels <- names(mtcars)[3:8]

ui <- fluidPage(
  shinyjs::useShinyjs(),
  h5('Channels', style = 'font-weight:bold'),
  uiOutput('ChannelCheckboxes'),
  h5('Statistics', style = 'font-weight:bold'),
  uiOutput('CalculationCheckboxes')

)


server <- function(input, output, session) {

  values <- reactiveValues(Statisticlist = list())
  ## build observer to deselect all sub category checkboxes if channel is deselected
  lapply(Channels, function(x) {
    observeEvent(input[[paste('Channel', x, sep = '')]], {
      if(!input[[paste('Channel', x, sep = '')]]) {
        shinyjs::disable(paste("Calculations", x, sep = ''))
        updateCheckboxGroupInput(session, inputId = paste("Calculations", x, sep = ''), selected=character(0))

      } else {
        shinyjs::enable(paste("Calculations", x, sep = ''))

        }
    })
  })

  output$ChannelCheckboxes <- renderUI({
    fluidRow(
    lapply(Channels, function(x) {
      column(2,
             checkboxInput(inputId = paste('Channel', x, sep = ''), label = x)
        )
    })
  )
  })

output$CalculationCheckboxes <- renderUI({
  fluidRow(
    lapply(Channels, function(x) {
      column(2,
             checkboxGroupInput(inputId = paste("Calculations", x, sep = ''),  label = NULL, c('sum', 'mean', 'length', 'max', 'min', 'sd')) ) })
  )
})


  lapply(Channels, function(x) {
    observe({
      req(input[[paste('Channel', x, sep = '')]])
      if(input[[paste('Channel', x, sep = '')]] & !is.null(input[[paste("Calculations", x, sep = '')]])){
     values$Statisticlist[[paste(x)]] <- input[[paste("Calculations", x, sep = "")]]

      }
    })
  })


  observeEvent(values$Statisticlist, { print(values$Statisticlist)
    mylist <<- values$Statisticlist
    })
}

shinyApp(ui, server)

Upvotes: 2

Views: 221

Answers (2)

Mark
Mark

Reputation: 2889

To turn Uwe's answer into a function I did this:

Summarystats <- function(statlist, dataframe, group) { 
    statlist %>%
        names() %>% 
        lapply(
            function(.col) lapply(
                statlist[[.col]], 
                function(.fct) sprintf("%s.%s = %s(%s)", .col, .fct, .fct, .col))) %>% 
        unlist() %>% 
        paste(collapse = ", ") %>% 
        sprintf("as.data.table(dataframe)[, .(%s), by = group]", .) %>% 
        parse(text = .) %>% 
        eval()
    }

Now I can call:

Summarystats(mylist, mtcars, 'cyl')

allowing me to call a summary table for whichever dataframe and grouping the user wants in my Shiny App.

Upvotes: 0

Uwe
Uwe

Reputation: 42544

If I understand correctly, the question is not about in first place but about how to apply different aggregation functions to specific columns of a .

The names of the columns and the functions which are to be applied on are given as list mylist which is created by the shiny app.

Among the various approaches my preferred option is to compute on the language, i.e., to create a complete expression from the contents of mylist and to evaluate it:

library(magrittr)
library(data.table)
mylist %>%
  names() %>% 
  lapply(
    function(.col) lapply(
      mylist[[.col]], 
      function(.fct) sprintf("%s.%s = %s(%s)", .col, .fct, .fct, .col))) %>% 
  unlist() %>% 
  paste(collapse = ", ") %>% 
  sprintf("as.data.table(mtcars)[, .(%s), by = cyl]", .) %>% 
  parse(text = .) %>% 
  eval()

which yields the expected result

   cyl disp.sum disp.mean    hp.sd drat.sum drat.mean wt.max
1:   6   1283.2  183.3143 24.26049    25.10  3.585714  3.460
2:   4   1156.5  105.1364 20.93453    44.78  4.070909  3.190
3:   8   4943.4  353.1000 50.97689    45.21  3.229286  5.424

The character string which is parsed is created by

mylist %>%
  names() %>% 
  lapply(
    function(.col) lapply(
      mylist[[.col]], 
      function(.fct) sprintf("%s.%s = %s(%s)", .col, .fct, .fct, .col))) %>% 
  unlist() %>% 
  paste(collapse = ", ") %>% 
  sprintf("as.data.table(mtcars)[, .(%s), by = cyl]", .)

and looks as if coded manually:

[1] "as.data.table(mtcars)[, .(disp.sum = sum(disp), disp.mean = mean(disp), hp.sd = sd(hp), drat.sum = sum(drat), drat.mean = mean(drat), wt.max = max(wt)), by = cyl]"

Data

For demonstration, mylist is provided "hard-coded":

mylist <- list(
  disp = c("sum", "mean"),
  hp = "sd",
  drat = c("sum", "mean"),
  wt = "max")

Upvotes: 2

Related Questions