John Thomas

Reputation: 1115

How to apply the same function to every file in a folder in R?

I have a folder of identically formatted CSVs. Let's call the folder "Folder" and the CSVs:

Each csv is formatted as follows

ID   date        hours  info
001  01/01/2019  8      xxxx
002  01/01/2019  22     xxxx
003  01/02/2019  4      xxxx
004  01/02/2019  5      xxxx

The following works for a single file, but how can I run it on every file in the folder and combine the results?

totals <- df %>%
            group_by(date) %>%
            summarize(hour_sum = sum(hours))

So basically I want to have a dataframe which has every date in all files and the sum of the hours from ALL files.

For example, if 01/02/2019 appears in 3 files, I want the sum of hours for every occurrence of that date in one data frame.

Upvotes: 0

Views: 312

Answers (2)

ThomasIsCoding

Reputation: 102810

You could try the code below, which assumes the files are named like test1.csv, test2.csv and sit in the current working directory:

aggregate(
  hours ~ date,
  do.call(rbind, c(lapply(list.files(pattern = "test\\d+\\.csv"), read.csv), make.row.names = FALSE)),
  sum
)
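
For anyone who wants to try this without the original data, here is a self-contained sketch of the same base R idea. The file names (test1.csv, test2.csv), their contents, and the helper name sum_hours_by_date are assumptions made up for illustration; only the column names come from the question.

```r
# Hypothetical helper: read every CSV in `dir` matching `pattern`,
# stack the rows, and sum `hours` per `date`.
sum_hours_by_date <- function(dir, pattern = "\\.csv$") {
  files <- list.files(dir, pattern = pattern, full.names = TRUE)
  # Read ID as character so leading zeros such as "001" are kept
  frames <- lapply(files, read.csv, colClasses = c(ID = "character"))
  combined <- do.call(rbind, frames)
  aggregate(hours ~ date, data = combined, FUN = sum)
}

# Made-up sample data (an assumption, not the asker's real files)
tmp <- file.path(tempdir(), "Folder")
dir.create(tmp, showWarnings = FALSE)
write.csv(data.frame(ID = c("001", "002"),
                     date = c("01/01/2019", "01/02/2019"),
                     hours = c(8, 4), info = "xxxx"),
          file.path(tmp, "test1.csv"), row.names = FALSE)
write.csv(data.frame(ID = c("003", "004"),
                     date = c("01/02/2019", "01/03/2019"),
                     hours = c(5, 2), info = "xxxx"),
          file.path(tmp, "test2.csv"), row.names = FALSE)

totals <- sum_hours_by_date(tmp)
print(totals)
```

Note that date is treated as plain text here, so rows are grouped by the exact string. If chronological ordering matters, converting with as.Date(date, format = "%m/%d/%Y") before aggregating is one option.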

Upvotes: 0

nniloc

Reputation: 4243

If you are willing to use the tidyverse, purrr provides map_dfr, which applies a function to each file and row-binds the results into a single data frame.

The code would look something like this:

library(tidyverse)

list.files(path = "path_to_data", full.names = TRUE) %>%
  map_dfr(read.csv) %>%
  group_by(date) %>%
  summarize(hour_sum = sum(hours)) 
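
Two details may be worth adjusting. list.files() without a pattern returns every file in the folder, not only CSVs, and in purrr 1.0.0 and later map_dfr() is marked as superseded in favour of map() followed by list_rbind(). The sketch below is one way to write it under those assumptions; the temporary folder, the file names a.csv and b.csv, and their contents are made up so the example can run on its own.

```r
library(dplyr)
library(purrr)

# Made-up sample files in a temporary folder (an assumption for illustration)
data_dir <- file.path(tempdir(), "path_to_data")
dir.create(data_dir, showWarnings = FALSE)
write.csv(data.frame(ID = 1:2, date = c("01/01/2019", "01/02/2019"),
                     hours = c(8, 4), info = "xxxx"),
          file.path(data_dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(ID = 3:4, date = c("01/02/2019", "01/02/2019"),
                     hours = c(5, 22), info = "xxxx"),
          file.path(data_dir, "b.csv"), row.names = FALSE)

totals <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE) %>%
  map(read.csv) %>%      # one data frame per file
  list_rbind() %>%       # stack them; requires purrr >= 1.0.0
  group_by(date) %>%
  summarize(hour_sum = sum(hours))

print(totals)
```

On older purrr versions, the original map_dfr(read.csv) call does the same job as the map() and list_rbind() pair.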

Upvotes: 1
