Reputation: 4603
I'm writing a function in R that, among other things, subsets a data frame based on a vector of column names. I'm trying to exploit the default behavior of [.data.frame, which will return all columns if the 'j' argument is missing. Is there a way to pass a missing argument through my wrapper function? Here's a bare-bones example:
fixDataFrames <- function(listOfDataFrames, columns){
  lapply(listOfDataFrames, function(x) x[, columns])
}
If I do not specify a value for columns, I get an error when passing it to the [ function: "columns argument is missing, with no default".
Upvotes: 4
Views: 6111
Reputation: 174778
A slightly different tack is to drop the anonymous function and call [ directly.
fixDataFrames <- function(listOfDataFrames, columns = TRUE, drop = TRUE){
  lapply(listOfDataFrames, `[`, , j = columns, drop = drop)
}
Note that the blank space between the two commas is important: it holds the place of i, the row index. By leaving it missing we get the same behaviour as df[ , columns]. I also set drop = TRUE, as that is the default for [, so the function keeps the usual behaviour.
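To make the empty slot concrete, here is a minimal sketch of calling [ as an ordinary function outside lapply. The data frame df and the result names by_bracket, by_call and kept are made up for illustration and are not part of the answer above.

```r
# Hypothetical toy data, used only for this illustration.
df <- data.frame(x1 = 1:3, x2 = 4:6, x3 = 7:9)

# Bracket syntax and the equivalent direct call.
# The empty slot after df is the missing row index i.
by_bracket <- df[, 2]
by_call <- `[`(df, , 2)

# drop = FALSE keeps a one-column data frame instead of a plain vector.
kept <- `[`(df, , 2, drop = FALSE)
```

Here by_call holds the same vector as by_bracket, while kept is still a data frame with the single column x2.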
With the same data from @Chase's Answer:
## Sample data
df1 <- df2 <- data.frame(x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10))
listOfDataFrames <- list(df1, df2)
fixDataFrames(listOfDataFrames)
fixDataFrames(listOfDataFrames, 2)
fixDataFrames(listOfDataFrames, 2, drop = FALSE)
Giving
> fixDataFrames(listOfDataFrames)
[[1]]
x1 x2 x3
1 -1.98347150 -0.50473182 0.56554491
2 -0.19597580 0.41004825 -0.35646296
3 0.81792146 -0.07646175 -2.02534426
4 -0.01903514 0.70687248 -0.25373188
5 -0.49233958 0.42497338 -0.15647100
6 0.62296268 1.88127659 0.41952414
7 -0.27260248 -2.59046602 -1.99294060
8 1.46344557 1.44803287 0.08634971
9 0.62207040 1.78290849 -0.17131320
10 -1.05730518 -0.45478467 1.15346862
[[2]]
x1 x2 x3
1 -1.98347150 -0.50473182 0.56554491
2 -0.19597580 0.41004825 -0.35646296
3 0.81792146 -0.07646175 -2.02534426
4 -0.01903514 0.70687248 -0.25373188
5 -0.49233958 0.42497338 -0.15647100
6 0.62296268 1.88127659 0.41952414
7 -0.27260248 -2.59046602 -1.99294060
8 1.46344557 1.44803287 0.08634971
9 0.62207040 1.78290849 -0.17131320
10 -1.05730518 -0.45478467 1.15346862
> fixDataFrames(listOfDataFrames, 2)
[[1]]
[1] -0.50473182 0.41004825 -0.07646175 0.70687248 0.42497338 1.88127659
[7] -2.59046602 1.44803287 1.78290849 -0.45478467
[[2]]
[1] -0.50473182 0.41004825 -0.07646175 0.70687248 0.42497338 1.88127659
[7] -2.59046602 1.44803287 1.78290849 -0.45478467
> fixDataFrames(listOfDataFrames, 2, drop = FALSE)
[[1]]
x2
1 -0.50473182
2 0.41004825
3 -0.07646175
4 0.70687248
5 0.42497338
6 1.88127659
7 -2.59046602
8 1.44803287
9 1.78290849
10 -0.45478467
[[2]]
x2
1 -0.50473182
2 0.41004825
3 -0.07646175
4 0.70687248
5 0.42497338
6 1.88127659
7 -2.59046602
8 1.44803287
9 1.78290849
10 -0.45478467
Upvotes: 2
Reputation: 61903
You could set a default for columns so that all of the columns are returned if nothing is supplied. Using TRUE should work, since a single logical TRUE is recycled across every column:
fixDataFrames <- function(listOfDataFrames, columns = TRUE){
  lapply(listOfDataFrames, function(x) x[, columns])
}
# As Chase points out it is probably more prudent to add drop=FALSE as a parameter
fixDataFrames <- function(listOfDataFrames, columns = TRUE, drop = FALSE){
  lapply(listOfDataFrames, function(x) x[, columns, drop = drop])
}
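A quick check of how the second definition behaves, as a rough sketch: the toy data frame df, the list dfs, and the result names all_cols and second are assumptions added for illustration only.

```r
# Second definition from the answer above, repeated so the example runs on its own.
fixDataFrames <- function(listOfDataFrames, columns = TRUE, drop = FALSE){
  lapply(listOfDataFrames, function(x) x[, columns, drop = drop])
}

# Hypothetical toy data, used only for this illustration.
df <- data.frame(x1 = 1:3, x2 = 4:6, x3 = 7:9)
dfs <- list(df, df)

# No columns supplied: the TRUE default is recycled, so every column is kept.
all_cols <- fixDataFrames(dfs)

# One column by name: drop = FALSE keeps it as a one-column data frame.
second <- fixDataFrames(dfs, "x2")
```

With drop = FALSE as the default, every element of the result is a data frame, regardless of how many columns are selected.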
Upvotes: 6
Reputation: 69151
This seems like a hack, but setting up the second argument as ... allows for this behaviour:
#Sample data
df1 <- df2 <- data.frame(x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10))
listOfDataFrames <- list(df1, df2)
fixDataFrames <- function(listOfDataFrames, ...){
  lapply(listOfDataFrames, function(x) x[, ...])
}
> fixDataFrames(listOfDataFrames)
[[1]]
x1 x2 x3
1 -1.7475354 -1.3444461 0.2049100
2 0.1451163 1.4396253 0.5885829
...
[[2]]
x1 x2 x3
1 -1.7475354 -1.3444461 0.2049100
2 0.1451163 1.4396253 0.5885829
You might also want to add , drop = FALSE to prevent the data.frame from being coerced into a vector if a single column is selected.
Upvotes: 1
Reputation: 109844
This is untested but try:
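fixDataFrames <- function(listOfDataFrames, columns){
  # missing() only works on arguments of the function it is called in,
  # so test it here rather than inside the anonymous function
  useAll <- missing(columns)
  lapply(listOfDataFrames, function(x) {
    # fall back to every column when none were supplied
    if (useAll) {
      columns <- seq_len(ncol(x))
    }
    x[, columns]
  })
}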
Upvotes: 0