Reputation: 7464
If we look at any function written in C in R base source code we can see that the code is quite simple, e.g.
#include "nmath.h"
#include "dpq.h"
double dexp(double x, double scale, int give_log)
{
#ifdef IEEE_754
/* NaNs propagated correctly */
if (ISNAN(x) || ISNAN(scale)) return x + scale;
#endif
if (scale <= 0.0) ML_ERR_return_NAN;
if (x < 0.)
return R_D__0;
return (give_log ?
(-x / scale) - log(scale) :
exp(-x / scale) / scale);
}
While the function is vectorized if called from R:
dexp(rep(1, 5), 1:2)
## [1] 0.3678794 0.2706706 0.3678794 0.2706706 0.3678794
How is it so? What makes it properly vectorized?
Upvotes: 3
Views: 125
Reputation: 46876
Actually, you're not quoting the code that defines the R function. In R we see
> dexp
function (x, rate = 1, log = FALSE)
.Call(C_dexp, x, 1/rate, log)
<bytecode: 0x3582cf8>
<environment: namespace:stats>
which tells us that the dexp() function is defined in the stats package. With a little digging we see in the stats NAMESPACE
useDynLib(stats, .registration = TRUE, .fixes = "C_")
which tells us that the prefix C_ is added to the function name when exposed to R. So we're looking for a C function called dexp
in the stats package. There are two relevant entries
init.c:147: CALLDEF_MATH2_1(dexp),
distn.c:144:DEFMATH2_1(dexp)
the first is a macro making the C function available to R, the second is a macro defining the function. For the later, the macro is defined as
#define DEFMATH2_1(name) \
SEXP do_##name(SEXP sa, SEXP sb, SEXP sI) { \
return math2_1(sa, sb, sI, name); \
}
which tells us to look for the math2_1 function, a little further up the file
static SEXP math2_1(SEXP sa, SEXP sb, SEXP sI, double (*f)(double, double, int))
{
SEXP sy;
R_xlen_t i, ia, ib, n, na, nb;
double ai, bi, *a, *b, *y;
int m_opt;
int naflag;
if (!isNumeric(sa) || !isNumeric(sb))
error(R_MSG_NONNUM_MATH);
SETUP_Math2;
m_opt = asInteger(sI);
mod_iterate(na, nb, ia, ib) {
// if ((i+1) % NINTERRUPT) R_CheckUserInterrupt();
ai = a[ia];
bi = b[ib];
if_NA_Math2_set(y[i], ai, bi)
else {
y[i] = f(ai, bi, m_opt);
if (ISNAN(y[i])) naflag = 1;
}
}
FINISH_Math2;
return sy;
} /* math2_1() */
That mod_iterate()
call is actually another macro
#define mod_iterate(n1,n2,i1,i2) for (i=i1=i2=0; i<n; \
i1 = (++i1 == n1) ? 0 : i1,\
i2 = (++i2 == n2) ? 0 : i2,\
++i)
and you can see, at least with the eye of faith, that the C code is implementing a loop, so the magic implied by the original question (that compact C code results in vectorized calculations) is not there -- it is an iteration at the C level.
math2_1
takes as it's final argument a function, which in this case is the dexp
quoted in the original question. This function is applied to each element in the iteration.
Upvotes: 6