Reputation: 2394
I was doing some optimization by removing one step from the process:
> library(microbenchmark)
> microbenchmark(paste0("this","and","that"))
Unit: microseconds
expr min lq mean median uq max neval
paste0("this", "and", "that") 2.026 2.027 3.50933 2.431 2.837 34.038 100
> microbenchmark(.Internal(paste0(list("this","and","that"),NULL)))
Unit: microseconds
expr min lq mean median uq max neval
.Internal(paste0(list("this", "and", "that"), NULL)) 1.216 1.621 2.77596 2.026 2.027 43.764 100
So far so good.
But then, after noticing that list is defined as
function (...) .Primitive("list")
I tried to "simplify" further:
> microbenchmark(.Internal(paste0(.Primitive("list")("this","and","that"),NULL)))
Unit: microseconds
expr min lq mean median uq max neval
.Internal(paste0(.Primitive("list")("this", "and", "that"), NULL)) 3.241 3.242 4.66433 3.647 3.648 80.638 100
and the time increases! My guess is that processing the string "list" is the source of the problem, and that it's handled differently within an actual call to the function list. But how?
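One way to test that guess would be to hoist the .Primitive("list") lookup out of the benchmarked expression, so that only the resulting call is timed (a sketch I haven't timed here; plist is just an illustrative name):
plist <- .Primitive("list")   # do the "list" string lookup once, up front
microbenchmark(
  .Internal(paste0(list("this","and","that"), NULL)),
  .Internal(paste0(plist("this","and","that"), NULL))
)
If the string-processing guess is right, the plist version should come back down to roughly the list version's time.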
Disclaimer: I know this hurts readability more than it helps performance. This is just for some very simple functions that will not change and are used so often that even slight performance gains are worth the cost.
Edit in response to Josh O'Brien's comment:
I'm not sure what this says about his idea, but
library(compiler)
ff <- compile(function(...){.Internal(paste0(.Primitive("list")("this","and","that"),NULL))})
ff2 <- compile(function(...){.Internal(paste0(list("this","and","that"),NULL))})
microbenchmark(eval(ff),eval(ff2),times=10000)
> microbenchmark(eval(ff2),eval(ff),times=10000)
Unit: microseconds
expr min lq mean median uq max neval
eval(ff2) 1.621 2.026 2.356761 2.026 2.431 144.257 10000
eval(ff) 1.621 2.026 2.455913 2.026 2.431 89.148 10000
and looking at the plot generated by microbenchmark (just wrap the call in plot() to see it yourself) over a bunch of runs, it appears that the two have statistically identical performance, even though the "max" value makes ff2 look like it has a worse worst case. I don't know what to make of that, but maybe it will help someone. So all of that basically says they compile to identical code. Does that mean his comment is the answer?
Upvotes: 14
Views: 400
Reputation: 9696
The R interpreter has hardcoded optimizations for common functions, and this goes deeper than byte compiling:
> library(compiler)
> list2 <- list
> list3 <- cmpfun(list2)
> microbenchmark(
+ list(1,2),
+ list2(1,2),
+ list3(1,2)
+ )
Unit: nanoseconds
expr min lq mean median uq max neval
list(1, 2) 576 620.5 654.53 640.0 675.5 941 100
list2(1, 2) 619 702.0 1123.43 728.0 761.0 39045 100
list3(1, 2) 617 683.0 735.83 715.5 759.0 1964 100
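Note that the value bound to list2 is the very same primitive; the difference is in the symbol. A quick check (a sketch, separate from the timings above):
typeof(list)        # "builtin"
is.primitive(list)  # TRUE
typeof(list2)       # "builtin" -- same value, but the symbol list2 lacks
                    # the metadata shown in the SEXP dumps below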
Here's what the SEXPs look like. Note the metadata on the "list" symbol:
> .Internal(inspect(quote(list(1,2))))
@23b0ed0 06 LANGSXP g0c0 [NAM(2)]
@1ed8f48 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "list" (has value)
@2c7adf8 14 REALSXP g0c1 [] (len=1, tl=0) 1
@2c7adc8 14 REALSXP g0c1 [] (len=1, tl=0) 2
list2 is missing some metadata:
> list2 <- list
> .Internal(inspect(quote(list2(1,2))))
@23b1578 06 LANGSXP g0c0 [NAM(2)]
@23b0a70 01 SYMSXP g0c0 [] "list2"
@2c7ad08 14 REALSXP g0c1 [] (len=1, tl=0) 1
@2c7acd8 14 REALSXP g0c1 [] (len=1, tl=0) 2
.Primitive("list")
is a more complicated expression:
> .Internal(inspect(quote(.Primitive("list")(1,2))))
@297e748 06 LANGSXP g0c0 [NAM(2)]
@297d9a0 06 LANGSXP g0c0 []
@1ec4530 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] ".Primitive" (has value)
@2c7a888 16 STRSXP g0c1 [] (len=1, tl=0)
@1ed5588 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "list"
@2c7a858 14 REALSXP g0c1 [] (len=1, tl=0) 1
@2c7a828 14 REALSXP g0c1 [] (len=1, tl=0) 2
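That nested LANGSXP means the expression .Primitive("list") has to be evaluated, string lookup and all, every time the outer call runs, whereas list is a plain symbol lookup. A rough way to see that cost in isolation (a sketch; timings not shown):
microbenchmark(list, .Primitive("list"))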
Upvotes: 0
Reputation: 176718
The reason .Internal(paste0(.Primitive("list")("this","and","that"),NULL)) is slower seems to be what Josh O'Brien guessed: calling .Primitive("list") directly incurs some additional overhead.
You can see the effects via a simple example:
require(compiler)
pl <- cmpfun({.Primitive("list")})
microbenchmark(list(), .Primitive("list")(), pl())
# Unit: nanoseconds
# expr min lq median uq max neval
# list() 63 98.0 112.0 140.5 529 100
# .Primitive("list")() 4243 4391.5 4486.5 4606.0 16077 100
# pl() 79 135.5 148.0 175.5 39108 100
That said, you're not going to be able to improve the speed of .Primitive and .Internal from the R prompt; they are both entry points to C code. And there's no reason to try to replace a call to .Primitive with .Internal. That's recursive, since .Internal is itself a primitive:
> .Internal
function (call) .Primitive(".Internal")
You'll get the same slowness if you try to call .Internal "directly"... and a similar "speedup" if you compile the "direct" call.
Internal. <- function() .Internal(paste0(list("this","and","that"),NULL))
Primitive. <- function() .Primitive(".Internal")(paste0("this","and","that"),NULL)
cPrimitive. <- cmpfun({Primitive.})
microbenchmark(Internal., Primitive., cPrimitive., times=1e4)
# Unit: nanoseconds
# expr min lq median uq max neval
# Internal. 26 27 27 28 1057 10000
# Primitive. 28 32 32 33 2526 10000
# cPrimitive. 26 27 27 27 1706 10000
Upvotes: 11