DeltaIV

Reputation: 5646

Assign read.csv with some set parameters to a name, in order to pass it to a function

I want to read multiple files. To do this, I use a generic function, read_list:

read_list(file_list, read_fun)

By assigning different read functions to the argument read_fun, I can read different kinds of files, e.g. read.csv for CSV files, read_dta for Stata files, etc.

Now I need to read some CSV files where the first four lines must be skipped. So, instead of passing read.csv as an argument to read_list, I would like to pass read.csv with the skip argument set to 4. Is it possible to do this in R? I tried

my_read_csv <- function(...) {
  read.csv(skip = 4, ...)
}

It seems to work, but I would like to confirm that this is the right way to do it. I think that functions being objects in R is a fantastic and very powerful feature of the language, but I'm not very familiar with R closures and scoping rules, thus I don't want to inadvertently make some big mistake.

Upvotes: 0

Views: 378

Answers (2)

AlexR

Reputation: 2408

You can simply rewrite your read_list to accept ... (the dots argument) as its last parameter, and then replace the call read_fun(file) inside it with read_fun(file, ...).

This will allow you to write the following syntax:

read_list(files, read.csv, skip = 4)

which will be equivalent to using your current read_list with a custom read function:

read_list(files, function(file) read.csv(file, skip = 4))
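To check that the two calls really are equivalent, here is a minimal sketch. The body of read_list below is only a guess at what yours looks like (the question does not show it), and the temporary demo files with four junk lines are made up for illustration.

```r
# Hypothetical read_list: forwards any extra arguments to read_fun via `...`
read_list <- function(file_list, read_fun, ...) {
  lapply(file_list, function(file) read_fun(file, ...))
}

# Two throwaway demo files, each with four junk lines before the header
files <- tempfile(pattern = c("a_", "b_"), fileext = ".csv")
for (f in files) {
  writeLines(c("junk 1", "junk 2", "junk 3", "junk 4", "x,y", "1,2", "3,4"), f)
}

# Extra argument forwarded through `...`
via_dots <- read_list(files, read.csv, skip = 4)

# Same thing with the argument fixed inside a custom read function
via_wrapper <- read_list(files, function(file) read.csv(file, skip = 4))

# Compare the two results
same <- identical(via_dots, via_wrapper)
print(same)
```

The ... version keeps read_list generic: the caller decides which extra arguments to pass, and read_list does not need to know about any of them.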

Also, be aware that read_list sounds an awful lot like a "reinvent the wheel" function. If you describe the behaviour of read_list a little more, I can expand.
Possible alternatives may be:

read_list <- function(files, read_fun, ...) lapply(files, read_fun, ...)
# in this case read_list is identical to lapply
read_list <- function(files, read_fun, ...) do.call(rbind, lapply(files, read_fun, ...))
# this will rbind() all the files into one data.frame
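The difference between the two alternatives is easiest to see side by side. This is only a sketch: the names read_list_each and read_list_all are made up here so that both variants can coexist, and the demo files are temporary files written just for the example. Note that rbind() assumes all files have the same columns.

```r
# Variant 1 (hypothetical name): returns a list with one data.frame per file
read_list_each <- function(files, read_fun, ...) {
  lapply(files, read_fun, ...)
}

# Variant 2 (hypothetical name): stacks all files into a single data.frame
read_list_all <- function(files, read_fun, ...) {
  do.call(rbind, lapply(files, read_fun, ...))
}

# Two throwaway demo files with four lines to skip before the header
files <- tempfile(pattern = c("a_", "b_"), fileext = ".csv")
for (f in files) {
  writeLines(c("n1", "n2", "n3", "n4", "x,y", "1,2", "3,4"), f)
}

as_list <- read_list_each(files, read.csv, skip = 4)
stacked <- read_list_all(files, read.csv, skip = 4)

str(as_list)  # a list of data.frames, one per file
str(stacked)  # one data.frame holding the rows of every file
```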

Upvotes: 1

Mark Timms

Reputation: 606

I'm not sure whether read_list is specialized to your task in some way, but you can use lapply along with read.csv to read a list of files:

# generate fake file names
files <- paste0('file_', 1:10, '.csv')

# Read files using lapply
dfs <- lapply(files, read.csv, skip = 4)

The third argument of lapply is ..., which lets you pass additional arguments on to the function you're applying. Each element of files is supplied as the first argument of read.csv, and anything given through ... (here skip = 4) is added to every call.
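Since the file names above are fake, that snippet will not run as written. Here is a self-contained variant that writes its own temporary files first; the file contents are invented for the demo. Naming the input vector with setNames is an optional extra, because lapply carries the names over to the result.

```r
# Create three throwaway CSV files, each with four lines to skip
files <- tempfile(pattern = paste0("file_", 1:3, "_"), fileext = ".csv")
for (f in files) {
  writeLines(c("skip 1", "skip 2", "skip 3", "skip 4", "id,value", "1,10", "2,20"), f)
}

# Name the input so each data.frame can be looked up by its file name
named_files <- setNames(files, basename(files))

# skip = 4 travels through lapply's `...` into every read.csv call
dfs <- lapply(named_files, read.csv, skip = 4)

names(dfs)  # the file names
dfs[[1]]    # the first file, read with the four leading lines skipped
```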

Upvotes: 0
