a11

Reputation: 3396

R apply custom function without for loop, expected output is matrix

How can I apply a custom function, with multiple arguments, to return an N x M dataframe without a for loop?

For example:

mat1 <- data.frame(a = 1:5, b = -1)
vec1 <- 100:107

myfcn <- function(x, vals){
  ans <- (x + vals[1]) * vals[2]
  return(ans)
}

df <- data.frame(matrix(nrow = nrow(mat1), ncol = length(vec1))) # pre-allocate: one row per row of mat1

for (i in seq_along(vec1)) {
  for (j in seq_len(nrow(mat1))) {
    result <- myfcn(vec1[i], vals = c(mat1$a[j], mat1$b[j]))
    df[j, i] <- result
  }
}
  
print(df)

This returns the desired output matrix:

[Screenshot of the desired output: a data frame with 5 rows and 8 columns]

How can I skip the for-loop and use some kind of apply function to get the above output matrix?

I tried sapply(vec1, myfcn, vals = c(mat1$a, mat1$b)), but it returns this:

[Screenshot of the sapply output]

And I tried outer(vec1, c(mat1$a, mat1$b), myfcn), but it returns this:

[Screenshot of the outer output]

Upvotes: 0

Views: 142

Answers (4)

langtang

Reputation: 24867

You can simply do this:

result <- sapply(vec1, \(v) (v + mat1$a) * mat1$b)

If you want to use myfcn, make a slight change to it:

myfcn <- function(x, v1, v2) (x + v1) * v2
result <- sapply(vec1, myfcn, v1 = mat1$a, v2 = mat1$b)

Output:

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] -101 -102 -103 -104 -105 -106 -107 -108
[2,] -102 -103 -104 -105 -106 -107 -108 -109
[3,] -103 -104 -105 -106 -107 -108 -109 -110
[4,] -104 -105 -106 -107 -108 -109 -110 -111
[5,] -105 -106 -107 -108 -109 -110 -111 -112

If you want to convert the result to a data frame with the same column names as above, add the following:

setNames(as.data.frame(result), paste0("X", seq_len(ncol(result))))

Output:

    X1   X2   X3   X4   X5   X6   X7   X8
1 -101 -102 -103 -104 -105 -106 -107 -108
2 -102 -103 -104 -105 -106 -107 -108 -109
3 -103 -104 -105 -106 -107 -108 -109 -110
4 -104 -105 -106 -107 -108 -109 -110 -111
5 -105 -106 -107 -108 -109 -110 -111 -112
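For readers wondering why this works when the attempt in the question did not: in the question, vals = c(mat1$a, mat1$b) concatenates both columns into one vector, so vals[1] and vals[2] are just the first two values of mat1$a. Here, each call instead receives the whole columns and returns a length-5 vector, which sapply binds as one column of the result. The following is a minimal sketch that checks this against an explicit double loop; the names by_col and ref are hypothetical and used only for illustration.

```r
mat1 <- data.frame(a = 1:5, b = -1)
vec1 <- 100:107

# Each call returns one column: a length-5 vector, one value per row of mat1
by_col <- sapply(vec1, function(v) (v + mat1$a) * mat1$b)

# Reference result from an explicit double loop (hypothetical name: ref)
ref <- matrix(NA_real_, nrow = nrow(mat1), ncol = length(vec1))
for (i in seq_along(vec1)) {
  for (j in seq_len(nrow(mat1))) {
    ref[j, i] <- (vec1[i] + mat1$a[j]) * mat1$b[j]
  }
}

# Both approaches should agree element by element
stopifnot(isTRUE(all.equal(by_col, ref)))
dim(by_col)
#> [1] 5 8
```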

Upvotes: 2

Arthur

Reputation: 2402

pmap is the most elegant solution in my opinion.

In this particular case, there is an extra step to create the input data with vectors of the same length, but then the apply statement is as simple as df$result <- pmap_dbl(df, foo)

library(tidyverse)

foo <- function(x, a, b){
  ans <- (x + a) * b
  return(ans)
}

# input data
df <- data.frame(a = 1:5, b = -1) %>% expand_grid(x = 100:107)

# pmap matches arguments by name if its first argument (here, the columns of df) has names
df$result  <- pmap_dbl(df, foo)

# if the list has no names, its elements are matched to foo's arguments by position
df$result2 <- pmap_dbl(with(df, list(x, a, b)), foo)

print(df, n = Inf)
#> # A tibble: 40 × 5
#>        a     b     x result result2
#>    <int> <dbl> <int>  <dbl>   <dbl>
#>  1     1    -1   100   -101    -101
#>  2     1    -1   101   -102    -102
#>  3     1    -1   102   -103    -103
#>  4     1    -1   103   -104    -104
#>  5     1    -1   104   -105    -105
#>  6     1    -1   105   -106    -106
#>  7     1    -1   106   -107    -107
#>  8     1    -1   107   -108    -108
#>  9     2    -1   100   -102    -102
#> 10     2    -1   101   -103    -103
#> 11     2    -1   102   -104    -104
#> 12     2    -1   103   -105    -105
#> 13     2    -1   104   -106    -106
#> 14     2    -1   105   -107    -107
#> 15     2    -1   106   -108    -108
#> 16     2    -1   107   -109    -109
#> 17     3    -1   100   -103    -103
#> 18     3    -1   101   -104    -104
#> 19     3    -1   102   -105    -105
#> 20     3    -1   103   -106    -106
#> 21     3    -1   104   -107    -107
#> 22     3    -1   105   -108    -108
#> 23     3    -1   106   -109    -109
#> 24     3    -1   107   -110    -110
#> 25     4    -1   100   -104    -104
#> 26     4    -1   101   -105    -105
#> 27     4    -1   102   -106    -106
#> 28     4    -1   103   -107    -107
#> 29     4    -1   104   -108    -108
#> 30     4    -1   105   -109    -109
#> 31     4    -1   106   -110    -110
#> 32     4    -1   107   -111    -111
#> 33     5    -1   100   -105    -105
#> 34     5    -1   101   -106    -106
#> 35     5    -1   102   -107    -107
#> 36     5    -1   103   -108    -108
#> 37     5    -1   104   -109    -109
#> 38     5    -1   105   -110    -110
#> 39     5    -1   106   -111    -111
#> 40     5    -1   107   -112    -112

Created on 2022-10-10 with reprex v2.0.2
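Since the question asks for an N x M result, the long output above still needs reshaping. The following is a rough base R sketch of that step, not the answer's own code: it swaps in mapply for pmap_dbl so that it runs without the tidyverse, and the names long, wide, and idx are hypothetical. It relies on x varying fastest within each (a, b) pair, as in the printed output above.

```r
mat1 <- data.frame(a = 1:5, b = -1)
vec1 <- 100:107

foo <- function(x, a, b) (x + a) * b

# Long format: one row per (a, b, x) combination, with x varying fastest
idx  <- rep(seq_len(nrow(mat1)), each = length(vec1))
long <- data.frame(
  a = mat1$a[idx],
  b = mat1$b[idx],
  x = rep(vec1, times = nrow(mat1))
)

# mapply passes the vectors to foo by position: x, then a, then b
long$result <- mapply(foo, long$x, long$a, long$b)

# Reshape to wide: each block of length(vec1) results becomes one row
wide <- matrix(long$result, nrow = nrow(mat1), byrow = TRUE)
dim(wide)
#> [1] 5 8
```

The byrow = TRUE argument matters here: without it, the values would be filled column by column and end up in the wrong cells.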

Upvotes: 1

SamR

Reputation: 20512

You were on the right lines with outer():

outer(mat1$a, vec1, `+`) * mat1$b
#      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
# [1,] -101 -102 -103 -104 -105 -106 -107 -108
# [2,] -102 -103 -104 -105 -106 -107 -108 -109
# [3,] -103 -104 -105 -106 -107 -108 -109 -110
# [4,] -104 -105 -106 -107 -108 -109 -110 -111
# [5,] -105 -106 -107 -108 -109 -110 -111 -112

This also works when mat1$b varies by row (for example, -1:-5), not only when it is constant.
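To illustrate the non-constant case, here is a small sketch that assumes b = -1:-5 in place of the constant column. It works because R stores matrices column by column, so multiplying a 5-row matrix by a length-5 vector recycles the vector down each column and row i is scaled by b[i]. The variable names are illustrative only.

```r
a    <- 1:5
b    <- -1:-5   # assumed non-constant multiplier, one value per row
vec1 <- 100:107

# outer() builds the 5 x 8 matrix of sums; b is then recycled down each column
res <- outer(a, vec1, `+`) * b

# Cross-check against a column-by-column computation
expected <- sapply(vec1, function(v) (v + a) * b)
stopifnot(isTRUE(all.equal(res, expected)))

res[2, 1]  # (2 + 100) * -2
#> [1] -204
```

Note that this recycling only lines up as intended when length(b) equals the number of rows of the matrix.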

Upvotes: 1

Rui Barradas

Reputation: 76673

First of all, you do not need a double for loop to populate df. R is vectorized, so you can pass the entire vec1 to the function on each iteration and fill one row at a time.

mat1 <- data.frame(a = 1:5, b = -1)
vec1 <- 100:107

myfcn <- function(x, vals){
  ans <- (x + vals[1]) * vals[2]
  return(ans)
}

df <- as.data.frame(matrix(nrow = nrow(mat1), ncol = length(vec1))) # pre-allocate: one row per row of mat1

for (j in 1:nrow(mat1)){
  result <- myfcn(vec1, vals = c(mat1$a[j], mat1$b[j]))
  df[j,] <- result
}
df
#>     V1   V2   V3   V4   V5   V6   V7   V8
#> 1 -101 -102 -103 -104 -105 -106 -107 -108
#> 2 -102 -103 -104 -105 -106 -107 -108 -109
#> 3 -103 -104 -105 -106 -107 -108 -109 -110
#> 4 -104 -105 -106 -107 -108 -109 -110 -111
#> 5 -105 -106 -107 -108 -109 -110 -111 -112

Created on 2022-10-10 with reprex v2.0.2

Now with an sapply loop. You don't even have to pre-allocate space, because sapply builds the result on its own.

df2 <- t(sapply(1:nrow(mat1), \(j) myfcn(vec1, vals = c(mat1$a[j], mat1$b[j]))))
df2 <- as.data.frame(df2)

identical(df, df2)
#> [1] TRUE

Note also that if you need a tabular data structure but not specifically a data.frame, the final as.data.frame() call is not needed.
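As a possible variation on the sapply call above, vapply does the same job but requires the shape of each result to be declared up front, so an unexpected return length raises an error instead of silently changing the output. This is only a sketch under the same mat1, vec1, and myfcn definitions; the name res is hypothetical.

```r
mat1 <- data.frame(a = 1:5, b = -1)
vec1 <- 100:107

myfcn <- function(x, vals) (x + vals[1]) * vals[2]

# One call per row of mat1; each call must return length(vec1) numeric values
res <- t(vapply(
  seq_len(nrow(mat1)),
  function(j) myfcn(vec1, vals = c(mat1$a[j], mat1$b[j])),
  FUN.VALUE = numeric(length(vec1))
))

# vapply stacks the results as columns, so t() turns them into rows
dim(res)
#> [1] 5 8
```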

Upvotes: 1
