user7283235
user7283235

Reputation: 101

How to correctly change a list of a ListMatrix in rcpp in R

I would like to change the elements of a list of a ListMatrix in rcpp, but always failed to do that. Please see the the following toy example:

library("Rcpp")
cppFunction('
ListMatrix ListMatrixType(ListMatrix x){
            NumericMatrix a = x(0,0);
            a(0,0) = 100;
            return x;
            }
            ')
x = matrix(list(matrix(0,3,2)),2,2)
a = ListMatrixType(x)
a[[1,1]]
a[[2,2]]

I expect only a[[1,1] is changed, but why the a[[2,2]] is also changed?

> a[[1,1]]
     [,1] [,2]
[1,]  100    0
[2,]    0    0
[3,]    0    0
> a[[2,2]]
     [,1] [,2]
[1,]  100    0
[2,]    0    0
[3,]    0    0

I must misunderstand the indexing rule in rcpp. So my question is how to correctly change the elements of each list? I suppose each list contains a matrix.

Upvotes: 1

Views: 723

Answers (2)

Qiang Kou
Qiang Kou

Reputation: 522

I don't think you misunderstand the indexing in Rcpp.

Not only a[[1,1]] and a[[2,2]] were changed, I believe a[[1,2]] and a[[2,1]] were also changed.

If we use another way to generate the matrix x

x = matrix(data = c(list(matrix(0,3,2)),
                    list(matrix(0,3,2)),
                    list(matrix(0,3,2)),
                    list(matrix(0,3,2))), 2, 2)

We will see a[[2,2]] hasn't been changed.

> x = matrix(data = c(list(matrix(0,3,2)),
+                     list(matrix(0,3,2)),
+                     list(matrix(0,3,2)),
+                     list(matrix(0,3,2))), 2, 2)
> a = ListMatrixType(x)
> a[[1,1]]
     [,1] [,2]
[1,]  100    0
[2,]    0    0
[3,]    0    0
> a[[2,2]]
     [,1] [,2]
[1,]    0    0
[2,]    0    0
[3,]    0    0

Upvotes: 1

coatless
coatless

Reputation: 20746

TL;DR: Welcome to shared environments and Rcpp's use of pointers.

There are two ways to address R's treatment of objects in a shared environment to avoid the domino update associated with the function.

  1. perform a deep copy of the object and then update the matrix list position or
  2. have only unique elements inside the list (e.g. no duplicated matrices)

To begin, let's look at the memory allocation for this shared memory list. For convenience -- and to avoid tracemem() -- we'll use the lobstr package to explore the shared environment. The package is currently GitHub-only and can be obtained using:

install.packages("devtools")
devtools::install_github("r-lib/lobstr")

With lobstr in hand, let's take a look at the underlying memory addresses of the matrix list in R...

x = matrix(list(matrix(0,3,2)),2,2)
lobstr::obj_addrs(x)
# [1] "0x7fceb69757c8" "0x7fceb69757c8" "0x7fceb69757c8" "0x7fceb69757c8"
#      ^^^^^^^^^^^^^^   ^^^^^^^^^^^^^^ 
# identical memory addresses for all objects

Notice that all objects share the same memory location. Thus, R is treating each of these elements inside the list as being the same. R opts for this behavior to reduce the size of data in memory as each element belongs to the same shared environment.

This is particularly problematic for Rcpp as it is manipulating pointers, e.g. a location in memory that shows where a value is stored, and not values, e.g. a variable that holds a value like an int or a double.

Due to this pointer behavior, two actions occur:

  1. all of the matrices in the matrix list are updated simultaneously, and
  2. the overall object x is updated without requiring a return statement.

If we modify your function slightly, the second point is more apparent.

Modified Function to Illustrate Pointers

Note: This function no longer has a return type, yet we are still seeing the x object change in R's environment.

#include<Rcpp.h>

// [[Rcpp::export]]
void ListMatrixType_pointer(Rcpp::ListMatrix x){
  Rcpp::NumericMatrix a = x(0, 0);
  a(0, 0) = 100;
}

Output

ListMatrixType_pointer(x)
str(x)
# List of 4
# $ : num [1:3, 1:2] 100 0 0 0 0 0
# $ : num [1:3, 1:2] 100 0 0 0 0 0
# $ : num [1:3, 1:2] 100 0 0 0 0 0
# $ : num [1:3, 1:2] 100 0 0 0 0 0
# - attr(*, "dim")= int [1:2] 2 2

Notice, we do not have to return a value as x is automatically updated. Furthermore, we still have the same memory location for each of the elements, e.g.

lobstr::obj_addrs(x)
# [1] "0x7fceb69757c8" "0x7fceb69757c8" "0x7fceb69757c8" "0x7fceb69757c8"
#      ^^^^^^^^^^^^^^   ^^^^^^^^^^^^^^ 
# identical memory addresses for all objects

Deep Copies of Matrix

To get around the identical memory addresses, you can deeply clone the object and then save it back into the ListMatrix. The clone() function instantiates a new block of memory to store the values to. Thus, modifying one element no longer will trigger a domino update. Furthermore, when you add a cloned object back to a list, the memory address changes for only that element.

Using clone() to make a deep copy

#include<Rcpp.h>

// [[Rcpp::export]]
void ListMatrixType_clone(Rcpp::ListMatrix x){
  Rcpp::NumericMatrix a = x(0, 0);

  // Perform a deep copy of a into b
  // and, thus, changing the memory address
  Rcpp::NumericMatrix b = Rcpp::clone(a);
  b(0, 0) = 100;

  // Update x with b
  x(0, 0) = b;
}
Output
ListMatrixType_clone(x)
str(x)
# List of 4
# $ : num [1:3, 1:2] 100 0 0 0 0 0
# $ : num [1:3, 1:2] 0 0 0 0 0 0
# $ : num [1:3, 1:2] 0 0 0 0 0 0
# $ : num [1:3, 1:2] 0 0 0 0 0 0
# - attr(*, "dim")= int [1:2] 2 2

lobstr::obj_addrs(x)
# [1] "0x7fceb811a7e8" "0x7fceb69757c8" "0x7fceb69757c8" "0x7fceb69757c8"
#             ^^^^^^^          ^^^^^^^ 
#            different memory addresses

Unique Elements in a List

To emphasize the need for unique elements, consider modifying the matrix in the first position of the matrix list, the memory address for only the first element will change; the rest of the addresses will remain the same. e.g.

x[[1, 1]] = matrix(1, 2, 2)
lobstr::obj_addrs(x)
# [1] "0x7fceb811a7e8" "0x7fceb69757c8" "0x7fceb69757c8" "0x7fceb69757c8"
#             ^^^^^^^          ^^^^^^^ 
#            different memory addresses

Additional Resources

For further reading on this topic please see any of the linked posts below that emphasize pointer behavior associated with Rcpp (Disclaimer: I wrote them):

Miscellaneous note on Object Sizes

Note: object.size() from utils in Base R provides an incorrect calculation of the shared environment sizes. c.f. ?utils::object.size

This function merely provides a rough indication: it should be reasonably accurate for atomic vectors, but does not detect if elements of a list are shared, for example. (Sharing amongst elements of a character vector is taken into account, but not that between character vectors in a single object.)

lobstr::obj_size(x)
# 424 B
utils::object.size(x)
# 1304 bytes

So, due to the shared environment, the object size of the list is about 1/3.

Upvotes: 9

Related Questions