Reputation: 45
I have a table that looks like this:
Row Col Value
  1   1    31
  1   2    56
  1   8    13
  2   1    83
  2   2    51
  2   9    16
  3   2    53
I need to convert this table into a matrix (the Row column gives the row index and the Col column gives the column index), so that the output looks like this:
   1  2  3  4  5  6  7  8  9
1 31 56 NA NA NA NA NA 13 NA
2 83 51 NA NA NA NA NA NA 16
3 NA 53 NA NA NA NA NA NA NA
I believe there is a quicker way to do what I want than my current approach, which would be to loop over every row/column combination and cbind everything together.
Reproducible example:
require(data.table)
myTable <- data.table(
  Row   = c(1,1,1,2,2,2,3),
  Col   = c(1,2,8,1,2,9,1),
  Value = c(31,56,13,83,51,16,53))
Upvotes: 3
Views: 132
Reputation: 66819
Sparse matrix. You probably want a sparse matrix:
require(Matrix) # doesn't require installation
mySmat <- with(myTable,sparseMatrix(Row,Col,x=Value))
which gives
3 x 9 sparse Matrix of class "dgCMatrix"
[1,] 31 56 . . . . . 13 .
[2,] 83 51 . . . . . . 16
[3,] 53 . . . . . . . .
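The sparse result can be used much like an ordinary matrix; a small usage sketch (the positions below are just taken from the example data):
mySmat[2, 9]     # 16; cells that were never set come back as 0
mySmat[3, ]      # row 3 as an ordinary numeric vector
rowSums(mySmat)  # many matrix operations have sparse-aware methods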
Matrix. If you really need a matrix-class object with NAs, there's
myMat <- as.matrix(mySmat)
myMat[myMat==0] <- NA
which gives
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 31 56 NA NA NA NA NA 13 NA
[2,] 83 51 NA NA NA NA NA NA 16
[3,] 53 NA NA NA NA NA NA NA NA
Efficiency considerations. For shorter code:
myMat <- with(myTable,as.matrix(sparseMatrix(Row,Col,x=Value)))
myMat[myMat==0] <- NA
For faster speed (but slower than creating a sparse matrix), initialize to NA and then fill, as @jimmyb and @bgoldst do:
myMat <- with(myTable, matrix(NA, max(Row), max(Col)))
myMat[cbind(myTable$Row, myTable$Col)] <- myTable$Value
This workaround is only necessary if you insist on NAs over zeros. A sparse matrix is almost certainly what you should use: creating and working with it should be faster, and storing it should be less memory-intensive.
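As a rough check of the memory claim, here is a sketch (not part of the original answer) with made-up dimensions; on the tiny 3 x 9 example above the dense matrix is actually smaller, because the sparse class carries some fixed overhead, so the advantage only shows once the matrix is large and mostly empty:
library(Matrix)
set.seed(1)
n <- 2000                        # hypothetical size, for illustration only
i <- sample(n, 10000, replace = TRUE)
j <- sample(n, 10000, replace = TRUE)
bigSparse <- sparseMatrix(i, j, x = runif(10000), dims = c(n, n))
bigDense  <- as.matrix(bigSparse)
object.size(bigSparse)           # grows with the number of non-zero entries
object.size(bigDense)            # roughly n * n * 8 bytes, about 32 MB here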
Upvotes: 4
Reputation: 35314
I believe the most concise and performant way to achieve this is to preallocate the matrix with NAs, and then assign a vector slice by manually computing the linear indexes from Row and Col:
df <- data.frame(Row=c(1,1,1,2,2,2,3), Col=c(1,2,8,1,2,9,2), Value=c(31,56,13,83,51,16,53) );
m <- matrix(NA,max(df$Row),max(df$Col));
m[(df$Col-1)*nrow(m)+df$Row] <- df$Value;
m;
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
## [1,] 31 56 NA NA NA NA NA 13 NA
## [2,] 83 51 NA NA NA NA NA NA 16
## [3,] NA 53 NA NA NA NA NA NA NA
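The linear-index arithmetic works because R stores matrices column-major: element (r, c) sits at position (c - 1) * nrow + r, so it addresses the same cells as indexing with a two-column matrix. A quick check (a sketch, not part of the original answer):
idx <- (df$Col - 1) * nrow(m) + df$Row
identical(m[idx], m[cbind(df$Row, df$Col)])
## [1] TRUE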
Upvotes: 2
Reputation: 4387
Straightforward:
dat <- data.frame(
  Row   = c(1,1,1,2,2,2,3),
  Col   = c(1,2,8,1,2,9,1),
  Value = c(31,56,13,83,51,16,53))
m = matrix(NA, nrow = max(dat$Row), ncol = max(dat$Col))
m[cbind(dat$Row, dat$Col)] = dat$Value
m
Upvotes: 4
Reputation: 193517
xtabs in base R is perfect for this if you can live with "0" where you have NA.
This would be the basic approach:
xtabs(Value ~ Row + Col, myTable)
# Col
# Row 1 2 8 9
# 1 31 56 13 0
# 2 83 51 0 16
# 3 53 0 0 0
However, that doesn't fill in the gaps, because not all factor levels are available. You can do this separately, or on-the-fly, like this:
xtabs(Value ~ factor(Row, sequence(max(Row))) +
        factor(Col, sequence(max(Col))), myTable)
# factor(Col, sequence(max(Col)))
# factor(Row, sequence(max(Row))) 1 2 3 4 5 6 7 8 9
# 1 31 56 0 0 0 0 0 13 0
# 2 83 51 0 0 0 0 0 0 16
# 3 53 0 0 0 0 0 0 0 0
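Doing the factor conversion separately, as mentioned above, might look like this (a sketch that works on a copy so myTable itself is left untouched):
tab <- as.data.frame(myTable)
tab$Row <- factor(tab$Row, levels = seq_len(max(tab$Row)))
tab$Col <- factor(tab$Col, levels = seq_len(max(tab$Col)))
xtabs(Value ~ Row + Col, tab)   # same 3 x 9 table as above, with 0 in the gaps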
By extension, this means that if the "Row" and "Col" values are factors, dcast.data.table should also work:
dcast.data.table(myTable, Row ~ Col, value.var = "Value", drop = FALSE)
(But it doesn't in my test for some reason. I had to do library(reshape2); dcast(myTable, Row ~ Col, value.var = "Value", drop = FALSE) to get it to work, thus not taking advantage of "data.table" speed.)
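Spelled out, that reshape2 route (with the factor conversion done up front so that drop = FALSE keeps all nine columns) would look roughly like this; the final as.matrix step is my own addition to get back to a plain matrix:
library(reshape2)
tmp <- as.data.frame(myTable)
tmp$Row <- factor(tmp$Row, levels = seq_len(max(tmp$Row)))
tmp$Col <- factor(tmp$Col, levels = seq_len(max(tmp$Col)))
wide <- dcast(tmp, Row ~ Col, value.var = "Value", drop = FALSE)
as.matrix(wide[, -1])   # missing combinations come out as NA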
Upvotes: 1