TrainedMusician
TrainedMusician

Reputation: 97

R Looping through two vectors

Good day,

I need a function that creates increasing ID's for two parameters. I came up with this function which works fine, but I want it to be vectorized and I cannot seem to avoid a Big O factor of N². Are there any 'better' ways to do this?

Standard function:

threshold <- 3

calculateID <- function(p, r) {
    return((p-1) * threshold + r)
}

calculateID(1, 1) #returns 1
calculateID(1, 2) #returns 2
calculateID(1, 3) #returns 3
calculateID(2, 1) #returns 4
#.....
calculateID(5, 3) #returns 15

Vectorized function, I would like to give the two parameters as vectors so the function only has to be called once:

threshold <- 3
calculateIDVectorized <- function(p, r) {
    return(unlist(
        lapply(p, function(x) {
            lapply(r, function(y) {
                (x-1) * threshold + y
            })
        })
    ))
}

calculateIDVectorized(c(1, 2, 3, 4, 5), c(1, 2, 3)) # should return 1-15

To clarify: I want that every p and r argument is used so you should always get a result of length(p * r)

Upvotes: 4

Views: 85

Answers (4)

Brian Fisher
Brian Fisher

Reputation: 1367

Since the OP was interested in fast computation, I compared the solutions:

library(microbenchmark)

p <- c(1:500) # using larger data set
r <- c(1:20)

threshhold = length(r) # parameterizing threshold

m = microbenchmark(
tidy= crossing(p, r) %>% 
      rowwise %>% 
      transmute(out = calculateID(p, r)) %>%
      pull(out),

dcv = do.call(Vectorize(calculateID),unname(rev(expand.grid(r,p)))),

numbering = rev(expand.grid(r,p)) %>%
      arrange(Var2, Var1) %>%
      transmute(out = row_number()) %>%
      pull(out),

hybrid = rev(expand.grid(r,p)) %>%
      rowwise() %>%
      transmute(out = calculateID(Var2, Var1)) %>%
      pull(out),

outer = as.vector(t(outer(p, r, calculateID))),

outer_c = c(t(outer(p, r, calculateID))),

david = rep((p - 1), each = length(r)) * threshold + r
)
m
# Unit: microseconds
# expr       min        lq       mean     median         uq        max neval
# tidy 45441.869 47370.776 52123.6770 49482.1970 54158.4285 116780.840   100
# dcv 16259.935 17156.225 19641.6731 17897.8885 21576.0865  55489.586   100
# numbering  5947.147  6379.337  7127.5125  6576.3560  6952.3205  12005.854   100
# hybrid 44124.099 45856.210 51531.9480 47642.5405 52225.0600 175778.380   100
# outer   106.655   120.711   141.1137   128.9665   143.2465    265.072   100
# outer_c   117.811   137.446   152.5958   142.1315   155.9650    327.101   100
# david   223.125   230.711   257.5622   241.8675   260.6100    920.164   100

enter image description here

So it looks like the options using outer() are fastest with as.vector() edging out c(). @DavidArenburg's solution is also right up with the solutions using outer().

I added a hybrid option using dplyr::transmute() because rev(expand.grid()) was significantly faster thatn crossing(), which appears to be marginally faster than the straight dplyr route, but still not as fast as the do.call(Vectorize... or the others.

another option (added above) would be to arrange the data frame and create id's using dplyr::row_number() or 1:nrow(). This option would work if all the combinations for p and r are present and unique, but would fail with non-sequential values.

Upvotes: 2

akrun
akrun

Reputation: 887213

An option with tidyverse

library(dplyr)
library(tidyr)
crossing(p, r) %>% 
     rowwise %>% 
     transmute(out = calculateID(p, r)) %>%
     pull(out)
#[1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

Upvotes: 0

ThomasIsCoding
ThomasIsCoding

Reputation: 101663

Another base R option using do.call + Vectorize + expand.grid

> do.call(Vectorize(calculateID),unname(rev(expand.grid(r,p))))
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

Data

p <- c(1, 2, 3, 4, 5)
r <- c(1, 2, 3)

Upvotes: 1

Allan Cameron
Allan Cameron

Reputation: 173868

You can use outer:

calculateIDVectorized <- function(p, r) as.vector(t(outer(p, r, calculateID)))

calculateIDVectorized(c(1, 2, 3, 4, 5), c(1, 2, 3))
#> [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

Upvotes: 3

Related Questions