Reputation: 101
I would like to change the elements of a list of a ListMatrix in rcpp, but always failed to do that. Please see the the following toy example:
library("Rcpp")
cppFunction('
ListMatrix ListMatrixType(ListMatrix x){
NumericMatrix a = x(0,0);
a(0,0) = 100;
return x;
}
')
x = matrix(list(matrix(0,3,2)),2,2)
a = ListMatrixType(x)
a[[1,1]]
a[[2,2]]
I expect only a[[1,1]
is changed, but why the a[[2,2]]
is also changed?
> a[[1,1]]
[,1] [,2]
[1,] 100 0
[2,] 0 0
[3,] 0 0
> a[[2,2]]
[,1] [,2]
[1,] 100 0
[2,] 0 0
[3,] 0 0
I must misunderstand the indexing rule in rcpp. So my question is how to correctly change the elements of each list? I suppose each list contains a matrix.
Upvotes: 1
Views: 723
Reputation: 522
I don't think you misunderstand the indexing in Rcpp.
Not only a[[1,1]]
and a[[2,2]]
were changed, I believe a[[1,2]]
and a[[2,1]]
were also changed.
If we use another way to generate the matrix x
x = matrix(data = c(list(matrix(0,3,2)),
list(matrix(0,3,2)),
list(matrix(0,3,2)),
list(matrix(0,3,2))), 2, 2)
We will see a[[2,2]]
hasn't been changed.
> x = matrix(data = c(list(matrix(0,3,2)),
+ list(matrix(0,3,2)),
+ list(matrix(0,3,2)),
+ list(matrix(0,3,2))), 2, 2)
> a = ListMatrixType(x)
> a[[1,1]]
[,1] [,2]
[1,] 100 0
[2,] 0 0
[3,] 0 0
> a[[2,2]]
[,1] [,2]
[1,] 0 0
[2,] 0 0
[3,] 0 0
Upvotes: 1
Reputation: 20746
TL;DR: Welcome to shared environments and Rcpp's use of pointers.
There are two ways to address R's treatment of objects in a shared environment to avoid the domino update associated with the function.
To begin, let's look at the memory allocation for this shared memory list. For convenience -- and to avoid tracemem()
-- we'll use the lobstr
package to explore the shared environment. The package is currently GitHub-only and can be obtained using:
install.packages("devtools")
devtools::install_github("r-lib/lobstr")
With lobstr
in hand, let's take a look at the underlying memory addresses of the matrix list in R...
x = matrix(list(matrix(0,3,2)),2,2)
lobstr::obj_addrs(x)
# [1] "0x7fceb69757c8" "0x7fceb69757c8" "0x7fceb69757c8" "0x7fceb69757c8"
# ^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^
# identical memory addresses for all objects
Notice that all objects share the same memory location. Thus, R is treating each of these elements inside the list as being the same. R opts for this behavior to reduce the size of data in memory as each element belongs to the same shared environment.
This is particularly problematic for Rcpp as it is manipulating pointers, e.g. a location in memory that shows where a value is stored, and not values, e.g. a variable that holds a value like an int
or a double
.
Due to this pointer behavior, two actions occur:
x
is updated without requiring a return statement.If we modify your function slightly, the second point is more apparent.
Note: This function no longer has a return type, yet we are still seeing the x
object change in R's environment.
#include<Rcpp.h>
// [[Rcpp::export]]
void ListMatrixType_pointer(Rcpp::ListMatrix x){
Rcpp::NumericMatrix a = x(0, 0);
a(0, 0) = 100;
}
ListMatrixType_pointer(x)
str(x)
# List of 4
# $ : num [1:3, 1:2] 100 0 0 0 0 0
# $ : num [1:3, 1:2] 100 0 0 0 0 0
# $ : num [1:3, 1:2] 100 0 0 0 0 0
# $ : num [1:3, 1:2] 100 0 0 0 0 0
# - attr(*, "dim")= int [1:2] 2 2
Notice, we do not have to return a value as x
is automatically updated. Furthermore, we still have the same memory location for each of the elements, e.g.
lobstr::obj_addrs(x)
# [1] "0x7fceb69757c8" "0x7fceb69757c8" "0x7fceb69757c8" "0x7fceb69757c8"
# ^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^
# identical memory addresses for all objects
To get around the identical memory addresses, you can deeply clone the object and then save it back into the ListMatrix
. The clone()
function instantiates a new block of memory to store the values to. Thus, modifying one element no longer will trigger a domino update. Furthermore, when you add a cloned object back to a list, the memory address changes for only that element.
clone()
to make a deep copy#include<Rcpp.h>
// [[Rcpp::export]]
void ListMatrixType_clone(Rcpp::ListMatrix x){
Rcpp::NumericMatrix a = x(0, 0);
// Perform a deep copy of a into b
// and, thus, changing the memory address
Rcpp::NumericMatrix b = Rcpp::clone(a);
b(0, 0) = 100;
// Update x with b
x(0, 0) = b;
}
Output
ListMatrixType_clone(x)
str(x)
# List of 4
# $ : num [1:3, 1:2] 100 0 0 0 0 0
# $ : num [1:3, 1:2] 0 0 0 0 0 0
# $ : num [1:3, 1:2] 0 0 0 0 0 0
# $ : num [1:3, 1:2] 0 0 0 0 0 0
# - attr(*, "dim")= int [1:2] 2 2
lobstr::obj_addrs(x)
# [1] "0x7fceb811a7e8" "0x7fceb69757c8" "0x7fceb69757c8" "0x7fceb69757c8"
# ^^^^^^^ ^^^^^^^
# different memory addresses
To emphasize the need for unique elements, consider modifying the matrix in the first position of the matrix list, the memory address for only the first element will change; the rest of the addresses will remain the same. e.g.
x[[1, 1]] = matrix(1, 2, 2)
lobstr::obj_addrs(x)
# [1] "0x7fceb811a7e8" "0x7fceb69757c8" "0x7fceb69757c8" "0x7fceb69757c8"
# ^^^^^^^ ^^^^^^^
# different memory addresses
For further reading on this topic please see any of the linked posts below that emphasize pointer behavior associated with Rcpp (Disclaimer: I wrote them):
Note: object.size()
from utils
in Base R provides an incorrect calculation of the shared environment sizes. c.f. ?utils::object.size
This function merely provides a rough indication: it should be reasonably accurate for atomic vectors, but does not detect if elements of a list are shared, for example. (Sharing amongst elements of a character vector is taken into account, but not that between character vectors in a single object.)
lobstr::obj_size(x)
# 424 B
utils::object.size(x)
# 1304 bytes
So, due to the shared environment, the object size of the list is about 1/3.
Upvotes: 9