William
William

Reputation: 402

Uploading multiple files in Shiny, process the files using lapply, rbind the results and return a download

In my previous post, I was able to upload multiple files in Shiny, process the files, rbind the results and return a csv file download, using a for loop. Thanks for the contribution of @SBista. However, because I have to upload a lot of files at a time (total size of about 50 - 100mb), I found running the shiny app to be very slow perhaps due to the for loop. I know that lapply() is faster in reading multiple csv files than for loop, but applying lapply() in my code gives an error (ERROR: Invalid 'description' argument ) after running the app. Any help will be appreciated. This is my dummy file, and this is my code:

 library(shiny)

 ui <- fluidPage(
   fluidPage(
     titlePanel("MY CSV FILES MERGER"),
     sidebarLayout(
       sidebarPanel(
         fileInput("file1",
              "Choose CSV files from a directory",
              multiple = TRUE,
              accept=c('text/csv', 
                       'text/comma-separated-values,text/plain', 
                       '.csv')),
         downloadButton('downloadData', 'Download')
       ),
       mainPanel(
         tableOutput('contents')
       )
     )
   )
 )

 library(shiny)
 library(dplyr)
 options(shiny.maxRequestSize = 100*1024^2)
 server <-  function(input, output) {
   getData <- reactive({
     inFile <- input$file1
     if (is.null(inFile)){
       return(NULL)
     }else {   
      files3 = lapply(inFile, function(y){
        JSON_csv = read.csv(y, header = TRUE)
        lastrow = nrow(JSON_csv)
        shift = function(x, n){
          c(x[-(seq(n))], rep(NA, n))
        }
        JSON_csv$companyID1 = shift(JSON_csv$companyID1, 1)
        JSON_csv = JSON_csv[-lastrow, ]
        JSON_csv 
      }

                 )
       do.call(rbind, files3)
     }
   })
   output$contents <- renderTable( 
     getData() 
   )
   output$downloadData <- downloadHandler(
     filename = function() { 
       paste("data-", Sys.time(), ".csv", sep="")
     },
     content = function(file) { 
       write.csv(getData(), file, row.names=FALSE)   
     })
 }

 shinyApp(ui = ui, server = server)

With for loop, this code works but it is very very slow while working with multiple csv files of 50-100mb:

 library(shiny)
 library(dplyr)
 server <-  function(input, output) {
 getData <- reactive({
  inFile <- input$file1
  if (is.null(inFile)){
    return(NULL)
  }else {
    # browser()
    numfiles = nrow(inFile) 
    kata_csv1 = list()


    for (i in 1:numfiles)
    {

      JSON_csv = read.csv(input$file1[[i, 'datapath']], header = TRUE)
      lastrow = nrow(JSON_csv)
      shift = function(x, n){
        c(x[-(seq(n))], rep(NA, n))
      }
      JSON_csv$companyID1 = shift(JSON_csv$companyID1, 1)
      kata_csv1[[i]] = JSON_csv[-lastrow, ]

    }
    # browser()
    do.call(rbind, kata_csv1)
     }
   })
  output$contents <- renderTable( 
  getData() 
  )
  output$downloadData <- downloadHandler(
  filename = function() { 
    paste("data-", Sys.Date(), ".csv", sep="")
  },
  content = function(file) { 
    write.csv(getData(), file, row.names=FALSE)   
  })
  }

 shinyApp(ui = ui, server = server)

Upvotes: 1

Views: 2575

Answers (2)

astronomerforfun
astronomerforfun

Reputation: 349

Here's a possible solution for files without using for loop:

 library(readxl)
    file.list <- list.files(pattern='*.xlsx')
    df.list <- lapply(file.list, read_excel)

Shiny. I run all my files in a separate script and reference the script using 'source' in the shiny app

source("SCRIPTGEO.R", local = TRUE) 

Here's a link on pulling multiple files. Reading multiple files into R - Best practice

Here's what I did on my app in the Script. I'm no expert so there may be other ways...

    fils1 <- list.files(pattern = ".csv")


allquotes <- function(fils1){
  dfs <- lapply(fils1, function(x){
  df <- read.csv(x, header = T, stringsAsFactors = FALSE)
  df <- df[-c(1,nrow(df)),]
  df <- df[,c(1,2,3,5,6,7,8)]
  colnames(df) <-  c("ID", "ENTRY_DATE", "QUOTEORORDER","BILL.TO", "NAME", "BRANCH", "CONVERTED")
  return(df)
  })
  testbind <- do.call("rbind", dfs)
  return(testbind)
}

Upvotes: 1

SBista
SBista

Reputation: 7704

the problem is when you are passing inFile to lapply you are actually passing only the first column containing the filename. Instead you'll need to passinFile$datapath. The lapply should be like this:

   files3 = lapply(inFile$datapath, function(y){

     JSON_csv = read.csv(y, header = TRUE)
     lastrow = nrow(JSON_csv)
     shift = function(x, n){
       c(x[-(seq(n))], rep(NA, n))
     }
     JSON_csv$companyID1 = shift(JSON_csv$companyID1, 1)
     JSON_csv = JSON_csv[-lastrow, ]
     JSON_csv 
   }

Hope it helps!

Upvotes: 1

Related Questions