Cybernetic

Reputation: 13354

Read CSV into R based on where header begins

I have a large number of CSV files. Some have the header beginning on the first row, others have the header beginning on the 3rd row, others the 7th and so on.

The headers all look the same; they just start on different rows in different files. Is there a way to make read.csv start reading where the header begins?

For example, if I know that the first column of every header is named "office#", could I instruct R to start reading the CSV file at the first line that contains "office#" and treat that row as the header?

Upvotes: 4

Views: 1815

Answers (1)

Cybernetic

Reputation: 13354

I have 4 CSV files:

One table with a header beginning on row 1 (iris.csv)

And 3 tables with headers beginning on rows 3, 1, and 5 (sales_1, sales_2, sales_3)

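As long as I know the first column name of each table, I can use the smart_csv_reader function below to find the row where each header begins and read each CSV file starting at that row. The entries of first_columns must be in the same order in which list.files() returns the files (alphabetical by file name):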

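# First-column name of each file's header, in the order list.files() returns the files
first_columns <- c('sepal.length', 'month', 'month', 'month')

smart_csv_reader <- function(directory) {
    # Escape the dot so only names ending in ".csv" match
    file_names <- list.files(directory, pattern="\\.csv$")
    header_begins <- integer(length(file_names))
    for(i in seq_along(file_names)) {
        # Plain concatenation: directory must end with a path separator
        path <- paste0(directory, file_names[i])
        lines_read <- readLines(path, warn=FALSE)
        # Keep only the first line that matches the expected column name
        header_begins[i] <- grep(first_columns[i], lines_read)[1]
    }
    print('headers detected on rows:')
    print(header_begins)
    l <- list()
    for(i in seq_along(header_begins)) {
        path <- paste0(directory, file_names[i])
        # Skip everything above the header so read.csv treats that row as the header
        l[[i]] <- read.csv(path, skip=header_begins[i]-1)
    }
    return(l)
}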

Just pass in the directory that holds all your CSVs, including the trailing slash, since the file paths are built by plain concatenation.

Usage:

smart_csv_reader('some_csvs/')

[1] "headers detected on rows:"
[1] 1 3 1 5

As you can see, the function detects the correct header row for each table. It also returns a list containing each table, read in correctly.
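For the single-file case from the question (every header starts with "office#"), the same idea can be sketched as a small helper. This is only a sketch: read_csv_from_header is a name made up for illustration, the demo file contents are invented, and it assumes comma-separated files whose header line has the known name as its first field.

```r
# Hypothetical helper (not part of base R): read one CSV, starting at the
# first line whose first comma-separated field equals `first_col`.
read_csv_from_header <- function(path, first_col) {
    lines_read <- readLines(path, warn = FALSE)
    # Take the first field of every line and strip any double quotes
    first_fields <- gsub('"', '', sub(",.*$", "", lines_read), fixed = TRUE)
    header_row <- which(trimws(first_fields) == first_col)[1]
    if (is.na(header_row)) {
        stop("no header line starting with '", first_col, "' in ", path)
    }
    # check.names = FALSE keeps column names such as "office#" unchanged
    read.csv(path, skip = header_row - 1, check.names = FALSE)
}

# Demo on a temporary file (invented data) with two junk lines above the header
tmp <- tempfile(fileext = ".csv")
writeLines(c("report generated by export tool",
             "",
             "office#,city,sales",
             "101,Boston,2500",
             "102,Denver,1800"), tmp)
offices <- read_csv_from_header(tmp, "office#")
print(names(offices))
print(nrow(offices))
```

Comparing the whole first field, rather than grepping the full line, avoids picking up a data row or a comment line that merely mentions the column name somewhere. To apply it to many files, something like lapply(list.files(directory, pattern = "\\.csv$", full.names = TRUE), read_csv_from_header, first_col = "office#") should work when all files share the same first column name.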

Upvotes: 3
