DFinch
DFinch

Reputation: 139

Combine and summaries multiple CSV from one folder in R

I am trying to combine multiple CSV files from one folder on the computer (over 200 CSVs) into one new file with all of the relevant data that is needed using R. Below is an example of two of the current CSV files (example data can be found here: https://drive.google.com/drive/folders/1RbQ_cMNX5ys53OMISataTy-ZZlfk1SC8?usp=sharing):

enter image description here

enter image description here

I am trying to create a new CSV file that has the site name, number, number of records, start year, end year, and the number of years between the start and end year. It would look like this with a new row for each of the original CSVs in the folder:

enter image description here

Upvotes: 2

Views: 205

Answers (1)

Paul
Paul

Reputation: 9107

Assuming your CSVs are in a folder ./data, get a vector of the file names.

library(tidyverse)

filenames <- list.files("./data", "*.csv", full.names = TRUE)

Create a function that reads in the data. Read in the first line to get the name. Then read in the rest of the data, skipping the first two rows.

process_csv <- function(filename) {
  name <- read_csv(filename, n_max = 1, col_names = FALSE)[[2]]
  
  number <- read_csv(filename, n_max = 1, col_names = FALSE, skip = 1)[[2]]
  
  read_csv(filename, skip = 2) %>%
    mutate(year = lubridate::year(lubridate::dmy_hm(DATE))) %>%
    summarise(
      count = n(),
      start_year = min(year),
      end_year = max(year),
      year_count = end_year - start_year
    ) %>%
    mutate(
      name = name,
      number = number,
      .before = 1
    )
}

Apply the function and save the output.

result <- map_dfr(filenames, process_csv)

write_csv(result, "result.csv")

result
#> # A tibble: 2 x 6
#>   name        number count start_year end_year year_count
#>   <chr>        <dbl> <int>      <dbl>    <dbl>      <dbl>
#> 1 NEW_PLACE_1    123   281       1972     2020         48
#> 2 NEW_PLACE_2    155   393       1961     2020         59

Upvotes: 2

Related Questions