Reputation: 5646
I want to read multiple files. To do this I use a generic function, read_list:
read_list(file_list, read_fun)
By assigning different read functions to the argument read_fun, I can read different kinds of files, e.g. read.csv for reading csv files, read_dta for Stata files, etc.
Now I need to read some csv files where the first four lines need to be skipped. Thus, instead of passing read.csv as an argument to read_list, I would like to pass read.csv with the skip argument set to 4. Is it possible to do this in R? I tried
my_read_csv <- function(...) {
  read.csv(skip = 4, ...)
}
It seems to work, but I would like to confirm that this is the right way to do it. I think that functions being objects in R is a fantastic and very powerful feature of the language, but I'm not very familiar with R closures and scoping rules, so I don't want to inadvertently make some big mistake.
Upvotes: 0
Views: 378
Reputation: 2408
You can simply rewrite your read_list to add the dots argument ... at the end of its argument list, and then replace the call to read_fun(file) with read_fun(file, ...).
This will allow you to write the following syntax:
read_list(files, read.csv, skip = 4)
which will be equivalent to using your current read_list with a custom read function:
read_list(files, function(file) read.csv(file, skip = 4))
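As a rough check of that equivalence, here is a minimal sketch. The body of read_list is an assumption (the question never shows it), and make_csv is a hypothetical helper that writes small temporary files with four junk lines before the header. It also runs the my_read_csv wrapper from the question to see that it gives the same result.

```r
# Hypothetical read_list: the original definition is not shown in the question,
# so this sketch assumes it simply applies read_fun to each file.
read_list <- function(files, read_fun, ...) {
  lapply(files, function(file) read_fun(file, ...))
}

# Hypothetical helper: write a temporary CSV with four junk lines before the header.
make_csv <- function(values) {
  path <- tempfile(fileext = ".csv")
  writeLines(c("junk 1", "junk 2", "junk 3", "junk 4", "x,y",
               paste(values, values * 2, sep = ",")), path)
  path
}
files <- c(make_csv(1:3), make_csv(4:6))

# Wrapper from the question: fixes skip = 4 and forwards everything else.
my_read_csv <- function(...) {
  read.csv(skip = 4, ...)
}

via_dots    <- read_list(files, read.csv, skip = 4)
via_lambda  <- read_list(files, function(file) read.csv(file, skip = 4))
via_wrapper <- read_list(files, my_read_csv)

# All three approaches should read the same data.
identical(via_dots, via_lambda)
identical(via_dots, via_wrapper)
```

The wrapper works because the file path is the only unnamed argument, so it is matched positionally to the file parameter of read.csv even though skip = 4 is written first.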
Also, be aware that read_list sounds an awful lot like a "reinvent the wheel" function. If you describe the behaviour of read_list a little more, I can expand.
Possible alternatives may be:
# In this case read_list is identical to lapply
read_list <- function(files, read_fun, ...) lapply(files, read_fun, ...)
# This will rbind() all the files into one data.frame
read_list <- function(files, read_fun, ...) do.call(rbind, lapply(files, read_fun, ...))
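To illustrate the rbind() variant, the following sketch stacks two temporary files into a single data frame. The file contents are made up for illustration, and it assumes every file has the same columns, which rbind() needs for data frames.

```r
# Variant from the answer: read every file, then stack the results row-wise.
read_list <- function(files, read_fun, ...) {
  do.call(rbind, lapply(files, read_fun, ...))
}

# Two illustrative temporary files sharing the same columns (hypothetical data).
files <- c(tempfile(fileext = ".csv"), tempfile(fileext = ".csv"))
write.csv(data.frame(id = 1:2, value = c(10, 20)), files[1], row.names = FALSE)
write.csv(data.frame(id = 3:4, value = c(30, 40)), files[2], row.names = FALSE)

combined <- read_list(files, read.csv)
nrow(combined)  # 4: two rows from each file
```

If you need to know which file each row came from, the lapply-only variant that returns a list of data frames is probably the easier starting point.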
Upvotes: 1
Reputation: 606
I'm not sure if read_list is specialized to your specific task in some way, but you can use lapply along with read.csv to read a list of files:
# generate fake file names
files <- paste0('file_', 1:10, '.csv')
# Read files using lapply
dfs <- lapply(files, read.csv, skip = 4)
The third argument of lapply is ..., which allows you to pass additional arguments to the function you're applying. In this case, we can use ... to pass the skip = 4 argument to read.csv.
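To see what lapply actually forwards, you can swap read.csv for a stand-in that just records its arguments. show_args below is a hypothetical helper written for this illustration, not part of base R, and the file names are placeholders that are never opened.

```r
# Hypothetical stand-in for read.csv: returns the arguments it was called with.
show_args <- function(file, ...) {
  list(file = file, extra = list(...))
}

# lapply calls show_args(element, skip = 4) once per element of the vector.
calls <- lapply(c("a.csv", "b.csv"), show_args, skip = 4)

calls[[1]]$file        # "a.csv"
calls[[2]]$extra$skip  # 4
```

Each element of the vector becomes the first argument, and everything passed after the function name is handed over unchanged on every call.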
Upvotes: 0