Reputation: 2186
I would like to generate a data frame in an Rcpp function that contains a list column. I have tried several things and have been unable to find a working solution. The following is an Rcpp C++ file where I have attempted this:
#include <Rcpp.h>
#include <vector>
using namespace Rcpp;
using namespace std;

// [[Rcpp::export]]
DataFrame makeListColumn() {
  vector<RawVector> the_future_list;
  the_future_list.push_back(RawVector::create(0, 1, 2));
  the_future_list.push_back(RawVector::create(3, 4));
  the_future_list.push_back(RawVector::create(5, 6, 7, 8, 9, 10));

  vector<int> another_regular_column;
  another_regular_column.push_back(42);
  another_regular_column.push_back(24);
  another_regular_column.push_back(4242);

  DataFrame ret = DataFrame::create(
    Named("another_regular_column") = another_regular_column,
    Named("thelistcol") = the_future_list);
  return ret;
}

/*** R
a = makeListColumn()
dput(a)
*/
The output from this is the following:
a = makeListColumn()
structure(list(another_regular_column = c(42L, 24L, 4242L, 42L, 24L, 4242L), thelistcol.as.raw.c.0x00..0x01..0x02.. = as.raw(c(0x00, 0x01, 0x02, 0x00, 0x01, 0x02)), thelistcol.as.raw.c.0x03..0x04.. = as.raw(c(0x03, 0x04, 0x03, 0x04, 0x03, 0x04)), thelistcol.as.raw.c.0x05..0x06..0x07..0x08..0x09..0x0a.. = as.raw(c(0x05, 0x06, 0x07, 0x08, 0x09, 0x0a))), class = "data.frame", row.names = c(NA, -6L))
What I am looking for is the following (done in a regular R script):
what_i_wanted = data.frame(
  another_regular_column = c(42, 24, 4242),
  thelistcol = I(list(as.raw(c(0, 1, 2)), as.raw(c(3, 4)), as.raw(c(5, 6, 7, 8, 9, 10))))
)
This produces the output:
structure(list(another_regular_column = c(42, 24, 4242), thelistcol = structure(list( as.raw(c(0x00, 0x01, 0x02)), as.raw(c(0x03, 0x04)), as.raw(c(0x05, 0x06, 0x07, 0x08, 0x09, 0x0a))), class = "AsIs")), class = "data.frame", row.names = c(NA, -3L))
The primary difference between the R and the Rcpp approach is the I() call in the R code. If I remove that, the R code generates the same structure as the Rcpp code. I looked through the Rcpp documentation and did some Google searches, but came up empty-handed.
Can somebody help me understand what I need to do in Rcpp to get this to work?
EDIT:
I did try to do something like:
List the_list = List::create(the_future_list);
the_list.attr("class") = CharacterVector::create("AsIs");
This unfortunately resulted in the following error:
Error in makeListColumn() : Could not convert using R function: as.data.frame.
Upvotes: 2
Views: 385
Reputation: 20746
AsIs isn't implemented in Rcpp.
The best way to work with list columns in a data.frame within C++ is to build everything with Rcpp::List. Recall that a data.frame is a list whose columns must all have the same number of observations. In addition, unlike a std container, an Rcpp::List lets us modify the object's attributes, so we can add the AsIs class flag.
In short, this looks like:
#include <Rcpp.h>

// [[Rcpp::export]]
SEXP makeListColumn() {
// ^ Changed from Rcpp::DataFrame to a general SEXP object.

  // Store inside of an Rcpp List
  Rcpp::List the_future_list(3);
  the_future_list[0] = Rcpp::RawVector::create(0, 1, 2);
  the_future_list[1] = Rcpp::RawVector::create(3, 4);
  the_future_list[2] = Rcpp::RawVector::create(5, 6, 7, 8, 9, 10);

  // Mark with AsIs
  the_future_list.attr("class") = "AsIs";

  // Store inside of a regular vector
  std::vector<int> another_regular_column;
  another_regular_column.push_back(42);
  another_regular_column.push_back(24);
  another_regular_column.push_back(4242);

  // Construct a list
  Rcpp::List ret = Rcpp::List::create(
    Rcpp::Named("another_regular_column") = another_regular_column,
    Rcpp::Named("thelistcol") = the_future_list);

  // Coerce to a data.frame
  ret.attr("class") = "data.frame";
  ret.attr("row.names") = Rcpp::seq(1, another_regular_column.size());

  // Return the data.frame
  return ret;
}
Most importantly, note that we forgo the Rcpp::DataFrame class and return a SEXP object. Moreover, we coerce the Rcpp::List to a data.frame by changing its class attribute and assigning row.names to the object.
In practice, the code returns:
a = makeListColumn()
str(a)
# 'data.frame': 3 obs. of 2 variables:
# $ another_regular_column: int 42 24 4242
# $ thelistcol :List of 3
# ..$ : raw 00 01 02
# ..$ : raw 03 04
# ..$ : raw 05 06 07 08 ...
# ..- attr(*, "class")= chr "AsIs"
Compared to the desired result:
what_i_wanted = data.frame(
  another_regular_column = c(42, 24, 4242),
  thelistcol = I(list(as.raw(c(0, 1, 2)), as.raw(c(3, 4)), as.raw(c(5, 6, 7, 8, 9, 10))))
)
str(what_i_wanted)
# 'data.frame': 3 obs. of 2 variables:
# $ another_regular_column: num 42 24 4242
# $ thelistcol :List of 3
# ..$ : raw 00 01 02
# ..$ : raw 03 04
# ..$ : raw 05 06 07 08 ...
# ..- attr(*, "class")= chr "AsIs"
all.equal(a, what_i_wanted)
# [1] TRUE
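As a side note, the six rows in the question's dput output follow from the observation count rule mentioned earlier: without AsIs, each raw vector became its own column, and the shorter columns were recycled up to the longest one. Below is a minimal plain C++ sketch of that rule (no Rcpp required), assuming the recycling behaviour is "every column length must divide the longest length evenly". The helper names recycled_row_count, lengths_when_spread, and lengths_as_list_column are hypothetical and exist only for illustration; they are not part of Rcpp or R.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Rcpp or R's C API.
// Given the length of every column, compute the row count that recycling
// would produce. Returns false when the lengths cannot be reconciled,
// i.e. when some length does not divide the longest one evenly.
bool recycled_row_count(const std::vector<std::size_t>& lengths,
                        std::size_t& rows) {
    rows = 0;
    if (lengths.empty()) return true;
    const std::size_t longest =
        *std::max_element(lengths.begin(), lengths.end());
    if (longest == 0) return true;  // all columns empty: zero rows
    for (std::size_t len : lengths) {
        if (len == 0 || longest % len != 0) return false;
    }
    rows = longest;
    return true;
}

// Column lengths when every inner vector becomes its own column,
// which is what happened in the question.
std::vector<std::size_t> lengths_when_spread(
        std::size_t regular_len,
        const std::vector<std::vector<unsigned char> >& nested) {
    std::vector<std::size_t> lengths(1, regular_len);
    for (std::size_t i = 0; i < nested.size(); ++i) {
        lengths.push_back(nested[i].size());
    }
    return lengths;
}

// Column lengths when the nested vector is kept as one list column:
// it contributes one entry per row, whatever the inner lengths are.
std::vector<std::size_t> lengths_as_list_column(
        std::size_t regular_len,
        const std::vector<std::vector<unsigned char> >& nested) {
    std::vector<std::size_t> lengths;
    lengths.push_back(regular_len);
    lengths.push_back(nested.size());
    return lengths;
}
```

With the question's data (a regular column of length 3 and inner raw vectors of lengths 3, 2, and 6), lengths_when_spread yields the lengths 3, 3, 2, 6, which reconcile to 6 rows, while lengths_as_list_column yields 3, 3 and therefore the intended 3 rows.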
Upvotes: 4