Ignore files with parsing errors in import (read_csv)

Question

My data is very raw: each file is a .zip with a .txt file inside. For the most part, the files read in cleanly with read_csv, but some contain lines where something else was being logged, which completely skews the column structure. Those files have no chance of being fixed.

When I use read_csv, these lines show up as parsing failures. I want to set up my code so that if any parsing failure appears in a file, the whole file is ignored. It would also be great to have a log of which files were ignored/thrown out. I looked into possibly(), but because the parsing failure is only a warning about individual lines and not an error for the file, possibly() doesn't skip the file.

This is my code at the moment.

library(dplyr)
library(readr)
library(purrr)

read_log <- function(path) {
  read_csv(path, col_types = cols(.default = col_character())) %>%
    mutate(filename = basename(path))
}

test_files <- file.path("example.txt") #would normally be list.files, simplified for this reprex

raw_data <- map_dfr(test_files, read_log)
#> Warning: 6 parsing failures.
#> row col   expected     actual          file
#>   3  -- 17 columns 4 columns  'example.txt'
#>   4  -- 17 columns 23 columns 'example.txt'
#>   5  -- 17 columns 23 columns 'example.txt'
#>   6  -- 17 columns 23 columns 'example.txt'
#>   7  -- 17 columns 23 columns 'example.txt'
#> ... ... .......... .......... .............
#> See problems(...) for more details.

Answers (1)
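
One possible approach, sketched under a few assumptions: readr records parsing failures on the returned data frame, and problems() retrieves them, so a wrapper can check whether any were recorded and drop the file if so. The helper names read_log_strict and read_all_logs are hypothetical, introduced only for illustration. The sketch also assumes problems() is called directly on the result of read_csv, before mutate() or any other transformation, so that the recorded problems are not lost along the way.

```r
library(dplyr)
library(readr)
library(purrr)

# Hypothetical helper: returns the parsed data, or NULL when readr
# recorded at least one parsing problem for this file.
read_log_strict <- function(path) {
  dat <- read_csv(path, col_types = cols(.default = col_character()))

  # Inspect the problems before transforming the data any further.
  probs <- problems(dat)
  if (nrow(probs) > 0) {
    message("Skipping ", basename(path), ": ",
            nrow(probs), " parsing problem(s)")
    return(NULL)
  }

  mutate(dat, filename = basename(path))
}

# Hypothetical helper: reads every path, keeps the clean files, and
# returns the names of the skipped files next to the combined data.
read_all_logs <- function(paths) {
  results <- map(paths, read_log_strict)
  was_skipped <- map_lgl(results, is.null)

  list(
    data    = bind_rows(compact(results)),  # compact() drops the NULLs
    skipped = basename(paths[was_skipped])
  )
}
```

With the test_files vector from the question, out <- read_all_logs(test_files) would give the combined data in out$data and the names of the ignored files in out$skipped. The parsing warning from read_csv may still be printed for the skipped files; the sketch does not try to silence it.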