Reputation: 1115
So I have a folder of identically formatted CSVs. Let's call the folder "Folder" and the CSVs:
Each CSV is formatted as follows:
ID   date        hours  info
001  01/01/2019  8      xxxx
002  01/01/2019  22     xxxx
003  01/02/2019  4      xxxx
004  01/02/2019  5      xxxx
The following works for a single file, but how could I run it on every file in the folder and combine the results?
library(dplyr)
# df is the data frame read from a single csv
totals <- df %>%
  group_by(date) %>%
  summarize(hour_sum = sum(hours))
So basically I want to have a dataframe which has every date in all files and the sum of the hours from ALL files.
So if 01/02/2019 appears in 3 files, I want the sum of hours for every occurrence of that date in one df.
Upvotes: 0
Views: 312
Reputation: 102810
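Maybe you could try the code below:
# Find every file named test<N>.csv in the working directory
files <- list.files(pattern = "test\\d+\\.csv")
# Read each file and stack the results into one data frame
combined <- do.call(rbind, c(lapply(files, read.csv), make.row.names = FALSE))
# Sum hours for each date across all files
aggregate(hours ~ date, combined, sum)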
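To check this approach end to end, here is a minimal sketch that writes two small sample files to a temporary folder and then aggregates them. The file names (test1.csv, test2.csv) and their contents are made up for illustration. It also passes full.names = TRUE, since list.files otherwise returns bare file names that read.csv can only find when the working directory is the folder itself.

```r
# Hypothetical sample data: two small csv files in a fresh temporary folder
folder <- tempfile("Folder")
dir.create(folder)
write.csv(data.frame(ID = 1:2, date = c("01/01/2019", "01/02/2019"), hours = c(8, 4)),
          file.path(folder, "test1.csv"), row.names = FALSE)
write.csv(data.frame(ID = 3:4, date = c("01/02/2019", "01/02/2019"), hours = c(5, 1)),
          file.path(folder, "test2.csv"), row.names = FALSE)

# full.names = TRUE returns paths that include the folder
files <- list.files(folder, pattern = "\\.csv$", full.names = TRUE)

# Read each file, stack the rows, then sum hours per date
combined <- do.call(rbind, lapply(files, read.csv))
totals <- aggregate(hours ~ date, data = combined, FUN = sum)
totals
```

With this sample data, 01/02/2019 appears in both files, so its total combines rows from each.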
Upvotes: 0
Reputation: 4243
If you are willing to use the whole tidyverse set of packages, purrr gives you map_dfr, which returns a single data frame by row-binding each dataset you read in. More info about it here.
The code would look something like this:
library(tidyverse)
list.files(path = "path_to_data", full.names = TRUE) %>%
  map_dfr(read.csv) %>%
  group_by(date) %>%
  summarize(hour_sum = sum(hours))
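One caveat: in purrr 1.0.0 and later, map_dfr is marked as superseded in favor of map() followed by list_rbind(). A sketch of the same pipeline in that style, assuming purrr 1.0.0 or newer is installed, might look like the following. The sample files and their contents are hypothetical and only there to make the example runnable.

```r
library(dplyr)
library(purrr)

# Hypothetical sample data: two small csv files in a fresh temporary folder
folder <- tempfile("Folder")
dir.create(folder)
write.csv(data.frame(ID = 1:2, date = c("01/01/2019", "01/02/2019"), hours = c(8, 4)),
          file.path(folder, "a.csv"), row.names = FALSE)
write.csv(data.frame(ID = 3:4, date = c("01/02/2019", "01/02/2019"), hours = c(5, 1)),
          file.path(folder, "b.csv"), row.names = FALSE)

totals <- list.files(folder, pattern = "\\.csv$", full.names = TRUE) %>%
  map(read.csv) %>%   # list with one data frame per file
  list_rbind() %>%    # stack them row-wise (requires purrr >= 1.0.0)
  group_by(date) %>%
  summarize(hour_sum = sum(hours))
totals
```

The pattern argument restricts the listing to files ending in .csv, which helps if the folder also holds other file types.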
Upvotes: 1