Reputation: 457
I want to get the column names of a matrix so that I can set them on another one, but if the matrix has no column names (that is, they are NULL), the following code crashes my R session.
CharacterVector cn = colnames(x);
This is how I currently get the column names of a matrix, even when it has none:
#include <Rcpp.h>
using namespace Rcpp;
// Get column names or empty
// [[Rcpp::export]]
CharacterVector get_colnames(const NumericMatrix &x) {
  CharacterVector cn;
  SEXP cnm = colnames(x);
  if (!Rf_isNull(cnm)) cn = cnm;
  return(cn);
}
Is there a more elegant way?
Upvotes: 1
Views: 875
Reputation: 368261
I had started this and then got distracted. @coatless covered it; this is simply shorter.
#include <Rcpp.h>
// [[Rcpp::plugins(cpp11)]]
using namespace Rcpp;
// [[Rcpp::export]]
CharacterVector getColnames(const NumericMatrix &x) {
  size_t nc = x.cols();
  SEXP s = x.attr("dimnames");  // could be nil or list
  if (Rf_isNull(s)) {           // no dimnames, need to construct names
    CharacterVector res(nc);
    for (size_t i=0; i<nc; i++) {
      res[i] = std::string("V") + std::to_string(i);
    }
    return(res);
  } else {                      // have names, return colnames part
    List dn(s);
    return(dn[1]);              // NULL if only row names were set; not handled here
  }
}
/*** R
m <- matrix(1:9,3,3)
getColnames(m)
colnames(m) <- c("tic", "tac", "toe")
getColnames(m)
*/
R> Rcpp::sourceCpp("~/git/stackoverflow/55850510/answer.cpp")
R> m <- matrix(1:9,3,3)
R> getColnames(m)
[1] "V0" "V1" "V2"
R> colnames(m) <- c("tic", "tac", "toe")
R> getColnames(m)
[1] "tic" "tac" "toe"
R>
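One detail in the output above: the generated names start at V0, whereas R itself numbers default names from 1 (for example, as.data.frame() on an unnamed matrix yields V1, V2, ...). A minimal sketch of 1-based name generation follows, written in plain C++ with no Rcpp dependency so it can be compiled on its own. The helper default_names and its prefix parameter are hypothetical names introduced for illustration; they are not part of Rcpp.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Rcpp): builds "V1", "V2", ..., "Vn".
std::vector<std::string> default_names(std::size_t n,
                                       const std::string& prefix = "V") {
    std::vector<std::string> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        // i + 1 so the labels follow R's 1-based convention
        out.push_back(prefix + std::to_string(i + 1));
    }
    return out;
}
```

In the Rcpp function above, the equivalent change would be std::to_string(i + 1) inside the loop.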
Upvotes: 3
Reputation: 20746
A few notes:
- A matrix is not required to have colnames() or rownames() set.
- Both are stored in dimnames.
- A NULL value can be detected with Rf_isNull().
- dimnames is part of the attributes for the object.
- A matrix created without names has no such attribute, so dimnames is null.
Let's verify the first point by creating a matrix without names and then one with names. Finally, we'll introduce a more verbose version of your function that tries to resolve a matrix without column names.
So, the traditional matrix construction would be:
x_no_names = matrix(1:4, nrow = 2)
x_no_names
#>      [,1] [,2]
#> [1,]    1    3
#> [2,]    2    4
colnames(x_no_names)
#> NULL
rownames(x_no_names)
#> NULL
attributes(x_no_names)
#> $dim
#> [1] 2 2
So, there is no dimnames attribute for a matrix created without column or row names.
What happens if we assign column or row names to the matrix?
# Create a matrix with names
x_named = x_no_names
colnames(x_named) = c("Col 1", "Col 2")
rownames(x_named) = c("Row 1", "Row 2")
# View attributes
attributes(x_named)
#> $dim
#> [1] 2 2
#>
#> $dimnames
#> $dimnames[[1]]
#> [1] "Row 1" "Row 2"
#>
#> $dimnames[[2]]
#> [1] "Col 1" "Col 2"
# View matrix object
x_named
#>       Col 1 Col 2
#> Row 1     1     3
#> Row 2     2     4
Notice: The matrix object now has a dimnames attribute.
With our understanding of the matrix structure, we can check:
- Does dimnames exist as an attribute on the matrix?
- Is dimnames not NULL?
Note: This approach will make the original function a bit more verbose. The trade-off is that the function avoids having to use a SEXP return type.
#include <Rcpp.h>
// Get column names or empty
// [[Rcpp::export]]
Rcpp::CharacterVector get_colnames(const Rcpp::NumericMatrix &x) {
  // Construct a character vector
  Rcpp::CharacterVector cn;
  // Create a numerical index for each column
  Rcpp::IntegerVector a = Rcpp::seq_len(x.ncol());
  // Coerce it to a character
  Rcpp::CharacterVector b = Rcpp::as<Rcpp::CharacterVector>(a);
  // Assign to character vector
  cn = b;
  if(x.hasAttribute("dimnames")) {
    Rcpp::List dimnames = x.attr( "dimnames" ) ;
    if(dimnames.size() != 2) {
      Rcpp::stop("`dimnames` attribute must have a size of 2 instead of %s.", dimnames.size());
    }
    // Verify column names exist by checking for NULL
    if(!Rf_isNull(dimnames[1]) ) {
      // Retrieve colnames and assign to cn.
      cn = dimnames[1];
    } else {
      // Write the generated names back onto the matrix
      // (this can modify the caller's object despite the const reference)
      colnames(x) = cn;
    }
  }
  return(cn);
}
Calling the function would now give:
get_colnames(x_no_names)
#> [1] "1" "2"
get_colnames(x_named)
#> [1] "Col 1" "Col 2"
The first result shows that the generated indices are used, whereas the second shows that the existing column names are retrieved.
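The checks in this answer distinguish three states: no dimnames attribute at all, a dimnames list whose column entry is NULL (for instance when only row names were set), and a dimnames list with real column names. As a rough model of that decision, here is a sketch in plain C++ with no Rcpp dependency, where a null pointer stands in for R's NULL. DimNames and colnames_or_index are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

typedef std::vector<std::string> Names;

// Hypothetical stand-in for R's dimnames list: a null pointer models NULL.
struct DimNames {
    const Names* rows;
    const Names* cols;
};

// Returns the stored column names, or "1", "2", ... when none exist.
Names colnames_or_index(const DimNames* dn, std::size_t ncol) {
    // Attribute present and its column entry is not NULL: use it
    if (dn != nullptr && dn->cols != nullptr) return *dn->cols;
    // No attribute, or column entry is NULL: fall back to the indices
    Names out;
    out.reserve(ncol);
    for (std::size_t i = 1; i <= ncol; ++i) out.push_back(std::to_string(i));
    return out;
}
```

In the Rcpp version, the two conditions correspond to x.hasAttribute("dimnames") and !Rf_isNull(dimnames[1]).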
Upvotes: 5