vonjd
vonjd

Reputation: 4380

How to optimize recursive function for finding all permutations?

I wrote the following piece of code to find all permutations of a given vector:

perm <- function(v, r = NULL, P = NULL) {
  l <- length(v)
  if (l == 0) {
    P <- rbind(P, r)
    rownames(P) <- NULL
    P
  } else {
    for (i in 1:l) {
      new_r <- c(r, v[i])
      new_v <- v[-i]
      P <- perm(new_v, new_r, P)
    }
    P
  }
}

P <- perm(1:9) # takes "forever" yet e.g. perm(1:7) is quite fast!?!
P

It does what it should but the problem is that it kind of runs forever if one uses vectors of length > 8 (as above).

My question
I don't really see the problem, I found some recursive implementations that don't look so different yet are much more efficient... So is there a simple way to optimize the code so that it runs faster?

Upvotes: 2

Views: 221

Answers (2)

Joseph Wood
Joseph Wood

Reputation: 7597

As @akrun states, recursion in R is generally not that efficient. However, if you must have a recursive solution, look no further than gtools::permutations. Here is the implementation:

permGtools <- function(n, r, v) {
    if (r == 1) 
        matrix(v, n, 1)
    else if (n == 1) 
        matrix(v, 1, r)
    else {
        X <- NULL
        for (i in 1:n) X <- rbind(X, cbind(v[i], permGtools(n - 1, r - 1, v[-i])))
        X
    }
}

By the way, to get the full source code, simply type gtools::permutations in the console and hit enter. For more information see How can I view the source code for a function?

And here are some timings:

system.time(perm(1:8))
  user  system elapsed 
34.074  10.641  44.815

system.time(permGtools(8,8,1:8))
 user  system elapsed 
0.253   0.001   0.255

And just for good measure:

system.time(permGtools(9, 9, 1:9))
 user  system elapsed 
2.512   0.046   2.567


Why is the OP's implementation slower?

Skip to the summary if you don't to read the details.

For starters, we can simply see that the OP's implementation makes more recursive calls than the implementation in gtools. To show this, we add count <<- count + 1L to the top of each function (N.B. We are using the <<- assignment operator which searches through the parent environments first). E.g:

permGtoolsCount <- function(n, r, v) {
    count <<- count + 1L
    if (r == 1)
    .
    .

And now we test a few lengths:

iterationsOP <- sapply(4:7, function(x) {
    count <<- 0L
    temp <- permCount(1:x)
    count
})

iterationsOP
[1]    65   326  1957 13700

iterationsGtools <- sapply(4:7, function(x) {
    count <<- 0L
    temp <- permGtoolsCount(x, x, 1:x)
    count
})

iterationsGtools
[1]   41  206 1237 8660

As you can see, the OP's implementation makes more calls in every case. In fact, it makes about 1.58... times the amount of recursive calls.

iterationsOP / iterationsGtools
[1] 1.585366 1.582524 1.582053 1.581986

As we have stated already, recursion in R has a bad reputation. I couldn't find anything pinpointing exactly why this is the case other than R does not employ tail-recursion.

At this point, it seems hard to believe that making about 1.58 times more recursive calls would explain the 175 times speed up we saw above (i.e. 44.815 / 0.255 ~= 175).

We can profile the code with Rprof in order to glean more information:

Rprof("perm.out", memory.profiling = TRUE)
a1 <- perm(1:8)
Rprof(NULL)
summaryRprof("perm.out", memory = "both")$by.total
             total.time total.pct mem.total self.time self.pct
"perm"            43.42    100.00   15172.1      0.58     1.34
"rbind"           22.50     51.82    7513.7     22.50    51.82
"rownames<-"      20.32     46.80    7388.7     20.30    46.75
"c"                0.02      0.05      23.7      0.02     0.05
"length"           0.02      0.05       0.0      0.02     0.05



Rprof("permGtools.out", memory.profiling = TRUE)
a2 <- permGtools(8, 8, 1:8)
Rprof(NULL)
summaryRprof("permGtools.out", memory = "tseries")$by.total
             total.time total.pct mem.total self.time self.pct
"rbind"            0.34    100.00     134.8      0.18    52.94
"cbind"            0.34    100.00     134.8      0.08    23.53
"permGtools"       0.34    100.00     134.8      0.06    17.65
"matrix"           0.02      5.88       0.0      0.02     5.88

One thing that jumps out immediately (other than the time) is the huge memory usage of the OP's implementation. The OP's implementation uses roughly 15 Gb of memory whereas the gtools implementation only use 134 Mb.

Digging Deeper

In the above, we are simply looking at memory usage in a general view by setting the memory parameter to both. There is another setting called tseries that lets you look at the memory usage over time.

head(summaryRprof("perm.out", memory = "tseries"))
     vsize.small vsize.large    nodes duplications       stack:2
0.02     4050448    25558992 49908432         2048 "perm":"perm"
0.04       98808    15220400  1873760          780 "perm":"perm"
0.06       61832    12024184  1173256          489 "perm":"perm"
0.08       45400           0   861728          358 "perm":"perm"
0.1            0    14253568        0          495 "perm":"perm"
0.12       75752    21412320  1436120          599 "perm":"perm"

head(summaryRprof("permGtools.out", memory = "tseries"))
     vsize.small vsize.large    nodes duplications              stack:2
0.02     4685464    39860824 43891512            0 "permGtools":"rbind"
0.04      542080      552384 12520256            0 "permGtools":"rbind"
0.06           0           0        0            0 "permGtools":"rbind"
0.08      767992     1200864 17740912            0 "permGtools":"rbind"
0.1       500208      566592 11561312            0 "permGtools":"rbind"
0.12           0      151488        0            0 "permGtools":"rbind"

There is a lot going on here, but the thing to focus on is the duplications field. From the documentation for summaryRprof we have:

It also records the number of calls to the internal function duplicate in the time interval. duplicate is called by C code when arguments need to be copied.

Comparing the number of copies in each implementation:

sum(summaryRprof("perm.out", memory = "tseries")$duplications)
[1] 121006

sum(summaryRprof("permGtools.out", memory = "tseries")$duplications)
[1] 0

So we see that the OP's implementation requires many copies to be made. I guess this isn't surprising given that the desired object is a parameter in the function prototype. That is, P is the matrix of permutations that is to be returned and is constantly getting larger and larger with each iteration. And with each iteration, we are passing it along to perm. You will notice in the gtools implementation that this is not the case as it simply as two numeric values and a vector for its parameters.

Summary

So there you have it, the OP's original implementation not only makes more recursive calls, but also require many copies which in turn bogs down the memory for drastic blows to efficiency.

Upvotes: 2

akrun
akrun

Reputation: 887078

It may be better to use permGeneral from RcppAlgos

P <- perm(1:5) # OP's function

library(RcppAlgos)
P1 <- permuteGeneral(5, 5)
all.equal(P, P1, check.attributes = FALSE)
#[1] TRUE  

Benchmarks

On a slightly longer sequence

system.time({

   P2 <- permuteGeneral(8, 8)
  })
#user  system elapsed 
#  0.001   0.000   0.001 

system.time({

   P20 <- perm(1:8) #OP's function
})
# user  system elapsed 
# 31.254  11.045  42.226 

all.equal(P2, P20, check.attributes = FALSE)
#[1] TRUE

Generally, recursive function can take longer time as recursive calls to the function takes more execution time

Upvotes: 1

Related Questions