Reputation: 2867
Is it possible to pass column indices to read_csv?
I am passing many CSV files with different header names to read_csv, so rather than specifying column names I would like to select columns by index.
Is this possible?
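# Map pairs each file with its own skip value; a bare `i` is undefined inside lapply()
df.list <- Map(function(f, s) read_csv(f, skip = s - 1), myExcelCSV, headers2skip)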
Upvotes: 4
Views: 799
Reputation: 226232
From the description of the col_types argument in ?readr::read_csv: "Alternatively, you can use a compact string representation where each character represents one column: c = character, i = integer, n = number, d = double, l = logical, f = factor, D = date, T = date time, t = time, ? = guess, or ‘_’/‘-’ to skip the column."
If you know the total number of columns in the file you could do it like this:
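library(readr)

# Read a CSV, dropping the columns at the positions given in skip_cols.
# tot_cols must equal the total number of columns in the file.
my_read <- function(..., tot_cols, skip_cols = numeric(0)) {
  csr <- rep("?", tot_cols)          # "?" = let readr guess the column type
  csr[skip_cols] <- "_"              # "_" = skip this column
  csr <- paste(csr, collapse = "")   # collapse to a compact string, e.g. "?_??"
  read_csv(..., col_types = csr)
}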
If you don't know the total number of columns in advance you could add code to this function to read just the first line of the file and count the number of columns returned ...
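One way that column count could be obtained is sketched below. It uses readr's count_fields() with tokenizer_csv() to count the fields on the first unskipped line; the helper name read_csv_drop and its arguments are made up for illustration, and it assumes the first line read has the same number of fields as the rest of the file.

```r
library(readr)

# Hypothetical helper: drop columns by position without knowing the
# total number of columns in advance.
read_csv_drop <- function(file, skip_cols = integer(0), skip = 0, ...) {
  # Count the fields on the first line that will actually be read
  tot_cols <- count_fields(file, tokenizer_csv(), skip = skip, n_max = 1L)
  stopifnot(all(skip_cols >= 1), all(skip_cols <= tot_cols))

  csr <- rep("?", tot_cols)   # guess every column type ...
  csr[skip_cols] <- "_"       # ... except the columns to be skipped
  read_csv(file, skip = skip, col_types = paste(csr, collapse = ""), ...)
}

# Usage with a small made-up file: drop the second column
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b,c", "1,2,3", "4,5,6"), tmp)
out <- read_csv_drop(tmp, skip_cols = 2)
names(out)
```

Passing skip through to both count_fields() and read_csv() keeps the column count aligned with the header row when the file starts with junk lines.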
FWIW, the skip argument might not do what you think it does: it skips rows rather than selecting or deselecting columns. As I read ?readr::read_csv, there doesn't seem to be any convenient way to skip and/or include particular columns (by name or by index) except by some ad hoc mechanism such as the one suggested above. This might be worth a feature request or discussion on the readr issues list (e.g. add cols_include and/or cols_exclude arguments that could be specified by name or position).
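A small sketch of the row-versus-column distinction follows. It assumes readr 2.0.0 or later, where read_csv() gained a col_select argument that accepts column names or positions; on older versions only the skip part applies. The file contents are made up for illustration.

```r
library(readr)

# Made-up file: one junk line above the header row
tmp <- tempfile(fileext = ".csv")
writeLines(c("exported by some tool", "a,b,c", "1,2,3", "4,5,6"), tmp)

# skip drops leading *rows*, so the header is found on line 2;
# all three columns are still read.
all_cols <- read_csv(tmp, skip = 1, show_col_types = FALSE)

# Assumption: readr >= 2.0.0. col_select picks *columns*, here by position.
by_pos <- read_csv(tmp, skip = 1, col_select = c(1, 3), show_col_types = FALSE)

names(all_cols)
names(by_pos)
```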
Upvotes: 6