Reputation: 32988
I need a simple wrapper to serialize arbitrary R objects from within Rcpp code. Below is a simplified version of my code:
#include <Rcpp.h>
// [[Rcpp::export]]
Rcpp::RawVector cpp_serialize(Rcpp::RObject x) {
  // Look up base::serialize and call it through Rcpp::Function
  Rcpp::Function serialize = Rcpp::Environment::namespace_env("base")["serialize"];
  return serialize(x, R_NilValue);
}
This works great; however, I found that for objects of class call, the call gets evaluated before being serialized. How can I prevent this from happening? I just want to mimic serialize() in R.
# Works as intended
identical(serialize(iris, NULL), cpp_serialize(iris))
# Does not work: call is evaluated
call_object <- call("rnorm", 1000)
identical(serialize(call_object, NULL), cpp_serialize(call_object))
Update: I have a workaround in place (see below) but I am still very interested in a proper solution.
Rcpp::RawVector cpp_serialize(Rcpp::RObject x) {
  // A default-constructed Rcpp::Environment refers to the global environment
  Rcpp::Environment env;
  env["MY_R_OBJECT"] = x;
  Rcpp::ExpressionVector expr("serialize(MY_R_OBJECT, NULL)");
  Rcpp::RawVector buf = Rcpp::Rcpp_eval(expr, env);
  return buf;
}
Upvotes: 4
Views: 181
Reputation: 368519
tl;dr: The question was: how does one serialize to raw vectors from C? Use the (compiled C) function serializeToRaw() in the RApiSerialize package, which provides R's own serialization code. As the benchmark below shows, it is between three and four times faster than what was suggested above.
Longer Answer: I would not recommend mucking around with Rcpp::Function() for this. We do in fact provide a proper package for R which gives access to serialization: RApiSerialize. It does not do much, but it exports exactly two functions, to serialize to and deserialize from RAW, which the RcppRedis package needs and uses.
So we can do the same here. I just called Rcpp.package.skeleton() to have a package 'jeroen' created, added the LinkingTo: and Imports: entries to DESCRIPTION and the import() directive to NAMESPACE, and then this works:
#include <Rcpp.h>
#include <RApiSerializeAPI.h>   // provides C API with serialization

// [[Rcpp::export]]
Rcpp::RawVector cpp_serialize(SEXP s) {
  Rcpp::RawVector x = serializeToRaw(s);   // from RApiSerialize
  return x;
}
It is basically a simpler version of what you have above.
And we can call that as you do:
testJeroen <- function() {
  ## Works as intended
  res <- identical(serialize(iris, NULL), cpp_serialize(iris))
  ## Didn't work above, works now
  call_object <- call("rnorm", 1000)
  res <- res &&
    identical(serialize(call_object, NULL), cpp_serialize(call_object))
  res
}
and lo and behold, it works:
R> library(jeroen)
Loading required package: RApiSerialize
R> testJeroen()
[1] TRUE
R>
So in short: if you don't want to muck with R, don't work with Rcpp::Function() objects.
Benchmark: Using a simple script
library(jeroen) # package containing both functions from here
library(microbenchmark)
microbenchmark(cpp = cpp_serialize(iris),   # my suggestion
               env = env_serialize(iris))   # OP's suggestion, renamed
we get
edd@max:/tmp/jeroen$ Rscript tests/quick.R
Loading required package: RApiSerialize
Unit: microseconds
expr min lq mean median uq max neval cld
cpp 17.471 22.1225 28.0987 24.4975 26.4795 420.001 100 a
env 85.028 91.0055 94.8772 92.9465 94.9635 236.710 100 b
edd@max:/tmp/jeroen$
showing that the answer by OP is between three and four times slower (median 92.9 vs. 24.5 microseconds).
Upvotes: 1
Reputation: 21315
I think you've found an unexpected behavior in the Rcpp::Function class. An MRE:
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
RObject cpp_identity(RObject x) {
  Rcpp::Function identity("identity");
  return identity(x);
}

/*** R
quoted <- quote(print(1))
identity(quoted)
cpp_identity(quoted)
*/
gives
> quoted <- quote(print(1));
> identity(quoted)
print(1)
> cpp_identity(quoted)
[1] 1
[1] 1
This happens because Rcpp effectively performs this evaluation behind the scenes:
Rcpp_eval(Rf_lang2(Rf_install("identity"), x))
which is basically like
eval(call("identity", quoted))
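but the call object passed as the argument is not 'protected' from evaluation: it is spliced into the outer call unquoted, so it gets evaluated as soon as the outer call is.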
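To make the mechanism concrete, here is a minimal sketch in plain R (base functions only) that reproduces both behaviours: passing the call object directly, and wrapping it in quote() first so that evaluating the outer call hands back the original object unchanged. The variable names unprotected and protected are illustrative only. Applying the same wrapping on the Rcpp side (building the quote() call before handing the argument to Rcpp::Function) is an assumption that follows from the explanation above and is not tested here.

```r
quoted <- quote(print(1))

# The inner call is spliced into the outer call as-is, so eval() runs print(1)
# while evaluating the argument of identity(); the result is the number 1.
unprotected <- eval(call("identity", quoted))

# Wrapping the inner call in quote() shields it: evaluating the argument now
# returns the call object itself instead of running it.
protected <- eval(call("identity", call("quote", quoted)))

is.call(protected)            # TRUE
identical(protected, quoted)  # TRUE
```

The first eval() is the R-level analogue of what cpp_identity() does in the MRE; the second shows that one extra layer of quoting is enough for the argument to arrive unevaluated.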
Upvotes: 3