Jesse Anderson

Reputation: 4603

Passing a missing argument in R

I'm writing a function in R that, among other things, subsets a data frame based on a vector of column names. I'm trying to exploit the default behavior of [.data.frame, which returns all columns if the j argument is missing. Is there a way to pass a missing argument through my wrapper function? Here's a bare-bones example:

fixDataFrames <- function(listOfDataFrames, columns){
    lapply(listOfDataFrames, function(x) x[,columns])
}

If I do not specify a value for columns, I get an error when it is passed to the [ function: 'argument "columns" is missing, with no default'.

Upvotes: 4

Views: 6111

Answers (4)

Gavin Simpson

Reputation: 174778

A slightly different tack is to not have an anonymous function and call [ directly.

fixDataFrames <- function(listOfDataFrames, columns = TRUE, drop = TRUE){
    lapply(listOfDataFrames, `[`, , j = columns, drop = drop)
}

Note that the blank space between the two commas is important: it holds the place of i, the row index. Leaving it empty gives the same behaviour as df[ , columns]. I also set drop = TRUE, since that is the default for [ when the row index is missing, so the function keeps the standard behaviour.
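To make the role of the empty slot concrete, here is a minimal sketch comparing the functional call form with ordinary bracket syntax. The data frame df and the variable names are made up for illustration and are not part of the question.

```r
# Hypothetical example data, not taken from the question
df <- data.frame(x1 = 1:3, x2 = 4:6, x3 = 7:9)

# Functional form of df[, 2]: the empty slot after df is the missing row index i
viaCall <- `[`(df, , 2)
viaBracket <- df[, 2]
identical(viaCall, viaBracket)

# The same argument layout forwarded through lapply()
res <- lapply(list(df, df), `[`, , j = 2, drop = FALSE)
names(res[[1]])
```

Without the empty slot, the 2 would be matched to i rather than j, so the call would no longer mean "all rows, second column".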

With the same data from @Chase's Answer:

## Sample data
df1 <- df2 <- data.frame(x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10))
listOfDataFrames <- list(df1, df2)

fixDataFrames(listOfDataFrames)
fixDataFrames(listOfDataFrames, 2)
fixDataFrames(listOfDataFrames, 2, drop = FALSE)

Giving

> fixDataFrames(listOfDataFrames)
[[1]]
            x1          x2          x3
1  -1.98347150 -0.50473182  0.56554491
2  -0.19597580  0.41004825 -0.35646296
3   0.81792146 -0.07646175 -2.02534426
4  -0.01903514  0.70687248 -0.25373188
5  -0.49233958  0.42497338 -0.15647100
6   0.62296268  1.88127659  0.41952414
7  -0.27260248 -2.59046602 -1.99294060
8   1.46344557  1.44803287  0.08634971
9   0.62207040  1.78290849 -0.17131320
10 -1.05730518 -0.45478467  1.15346862

[[2]]
            x1          x2          x3
1  -1.98347150 -0.50473182  0.56554491
2  -0.19597580  0.41004825 -0.35646296
3   0.81792146 -0.07646175 -2.02534426
4  -0.01903514  0.70687248 -0.25373188
5  -0.49233958  0.42497338 -0.15647100
6   0.62296268  1.88127659  0.41952414
7  -0.27260248 -2.59046602 -1.99294060
8   1.46344557  1.44803287  0.08634971
9   0.62207040  1.78290849 -0.17131320
10 -1.05730518 -0.45478467  1.15346862

> fixDataFrames(listOfDataFrames, 2)
[[1]]
 [1] -0.50473182  0.41004825 -0.07646175  0.70687248  0.42497338  1.88127659
 [7] -2.59046602  1.44803287  1.78290849 -0.45478467

[[2]]
 [1] -0.50473182  0.41004825 -0.07646175  0.70687248  0.42497338  1.88127659
 [7] -2.59046602  1.44803287  1.78290849 -0.45478467

> fixDataFrames(listOfDataFrames, 2, drop = FALSE)
[[1]]
            x2
1  -0.50473182
2   0.41004825
3  -0.07646175
4   0.70687248
5   0.42497338
6   1.88127659
7  -2.59046602
8   1.44803287
9   1.78290849
10 -0.45478467

[[2]]
            x2
1  -0.50473182
2   0.41004825
3  -0.07646175
4   0.70687248
5   0.42497338
6   1.88127659
7  -2.59046602
8   1.44803287
9   1.78290849
10 -0.45478467

Upvotes: 2

Dason

Reputation: 61903

You could set a default for columns so that all of the columns are returned when nothing is supplied. Using TRUE should work, because a logical index is recycled across every column.

fixDataFrames <- function(listOfDataFrames, columns = TRUE){
    lapply(listOfDataFrames, function(x) x[,columns])
}

# As Chase points out it is probably more prudent to add drop=FALSE as a parameter
fixDataFrames <- function(listOfDataFrames, columns = TRUE, drop = FALSE){
    lapply(listOfDataFrames, function(x) x[, columns, drop = drop])
}
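As a quick check of why TRUE behaves like "all columns", the sketch below uses a made-up data frame (df and the variable names are illustrative assumptions) to show the logical index being recycled:

```r
# Hypothetical example data, not taken from the question
df <- data.frame(x1 = 1:3, x2 = 4:6, x3 = 7:9)

# A length-one logical index is recycled to every column
allCols <- df[, TRUE]
names(allCols)

# Equivalent to spelling the logical vector out in full
sameCols <- df[, c(TRUE, TRUE, TRUE)]
identical(allCols, sameCols)
```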

Upvotes: 6

Chase

Reputation: 69151

This seems like a hack, but setting up the second argument as ... allows for this behaviour:

#Sample data
df1 <- df2 <- data.frame(x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10))
listOfDataFrames <- list(df1, df2)


fixDataFrames <- function(listOfDataFrames, ...){
  lapply(listOfDataFrames, function(x) x[,...])
}

> fixDataFrames(listOfDataFrames)
[[1]]
           x1         x2         x3
1  -1.7475354 -1.3444461  0.2049100
2   0.1451163  1.4396253  0.5885829
...
[[2]]
           x1         x2         x3
1  -1.7475354 -1.3444461  0.2049100
2   0.1451163  1.4396253  0.5885829

You might also want to add drop = FALSE to the [ call to prevent the data.frame from being reduced to a vector when a single column is selected.
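A small sketch of the difference drop makes, using a made-up data frame (df and the variable names are assumptions for illustration):

```r
# Hypothetical example data, not taken from the question
df <- data.frame(x1 = 1:3, x2 = 4:6, x3 = 7:9)

# Selecting one column with the default drop collapses the result to a vector
asVector <- df[, "x2"]
is.data.frame(asVector)

# With drop = FALSE the result stays a one-column data frame
asFrame <- df[, "x2", drop = FALSE]
is.data.frame(asFrame)
```

Keeping the data frame structure tends to matter when later code relies on names() or ncol() of each list element.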

Upvotes: 1

Tyler Rinker

Reputation: 109844

This is untested but try:

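fixDataFrames <- function(listOfDataFrames, columns){
    # missing() only works on the formal arguments of the function it is
    # called in, so check here rather than inside the anonymous function
    useAllColumns <- missing(columns)
    lapply(listOfDataFrames, function(x) {
        if (useAllColumns) {
            columns <- seq_len(ncol(x))
        }
        x[, columns]
    })
}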

Upvotes: 0
