wolfsatthedoor
wolfsatthedoor

Reputation: 7303

appending columns by column matching ID in R

So what I am trying to do is difficult for me to articulate but very straightforward, and I can easily show you. The title is my best guess at the verbage, edits appeciated.

set.seed(1)
theta=matrix(rnorm(6,0,1),2,3)
M =  c(    0 ,    0   ,   0  ,      0,   1  ,
 1,      0   ,   0    ,  0 ,    1,    
          2 ,     0   ,   0   ,   0, 2,
          0  ,    1   ,   0   ,   0 ,2,
          1   ,   1    ,  0   ,   0, 3,
          0    ,  2   ,   0 ,     0, 3)

M = matrix(M, nrow = 6,ncol= 5,byrow=T)
        theta
              [,1]     [,2]     [,3]
[1,] 0.4418121 1.962053 2.236691
[2,] 1.0931398 1.273616 1.050373

M
  prod11 prod12 prod21 prod22 d
1      0      0      0      0 1
2      1      0      0      0 1
3      2      0      0      0 2
4      0      1      0      0 2
5      1      1      0      0 3
7      0      2      0      0 3


OUTPUT DESIRED

  prod11 prod12 prod21 prod22 d theta1    theta2
1      0      0      0      0 1 0.4418121 1.0931398
2      1      0      0      0 1 0.4418121 1.0931398
3      2      0      0      0 2 1.962053  1.273616
4      0      1      0      0 2 1.962053  1.273616
5      1      1      0      0 3 2.236691  1.050373
7      0      2      0      0 3 2.236691  1.050373

Upvotes: 2

Views: 136

Answers (3)

RHertel
RHertel

Reputation: 23788

We can use merge():

theta <- t(theta) #transpose matrix 
theta <- cbind(theta,seq(1:nrow(theta))) # add column "d" with row numbers                
colnames(theta) <- c("theta1","theta2","d") 
merge(M,theta)
#  d prod11 prod12 prod21 prod22    theta1   theta2
#1 1      0      0      0      0 0.4418121 1.093140
#2 1      1      0      0      0 0.4418121 1.093140
#3 2      2      0      0      0 1.9620530 1.273616
#4 2      0      1      0      0 1.9620530 1.273616
#5 3      1      1      0      0 2.2366910 1.050370
#6 3      0      2      0      0 2.2366910 1.050370

data

M <-  c(0 ,  0  ,  0  ,  0 , 1,
        1 ,  0  ,  0  ,  0 , 1,    
        2 ,  0  ,  0  ,  0 , 2,
        0 ,  1  ,  0  ,  0 , 2,
        1 ,  1  ,  0  ,  0 , 3,
        0 ,  2  ,  0  ,  0 , 3)
M <- as.data.frame(matrix(M, nrow = 6,ncol= 5,byrow=TRUE))
colnames(M) <- c( "prod11","prod12","prod21","prod22", "d")
theta <-matrix(c(0.4418121, 1.962053, 2.236691,1.0931398, 1.273616, 1.05037), byrow=TRUE, nrow=2)

Upvotes: 1

Pierre L
Pierre L

Reputation: 28441

With base R:

mat1 <- cbind(M, apply(theta, 1, function(x) x[M[, "d"]]))
colnames(mat1) <- c(colnames(M), paste0("theta", 1:nrow(theta)))
#      prod11 prod12 prod21 prod22 d       theta1     theta2
# [1,]      0      0      0      0 1 -0.893800723 -0.3073283
# [2,]      1      0      0      0 1 -0.893800723 -0.3073283
# [3,]      2      0      0      0 2 -0.004822422  0.9881641
# [4,]      0      1      0      0 2 -0.004822422  0.9881641
# [5,]      1      1      0      0 3  0.839750360  0.7053418
# [6,]      0      2      0      0 3  0.839750360  0.7053418

The core of the function is x[M[, "d"]]. As in Micheal's answer, we can subset one matrix by a vector in another. The vector is column "d" of M, M[, "d"]. If that column had a more randomized code we would set up a more robust lookup. But since it matches the column numbers of theta, we can use it directly.

I wrapped it with apply as it works well with matrices. The second argument 1 indicates that the function should be carried out row-wise ( equivalent to theta[1, ] and theta[2, ] and so on. If I chose 2, x would be equivalent to theta[ ,1] and so on.

To match the column names to the desired output we use colnames (a possible pitfall is to attempt names() which works with data frames).

Upvotes: 4

MichaelChirico
MichaelChirico

Reputation: 34703

I would use data.table:

setDT(M)
M[, paste0("theta",1:2) := as.data.table(t(theta[, d]))]
> M
   V1 V2 V3 V4 V5     theta1     theta2
1:  0  0  0  0  1 -1.2341141  0.4675928
2:  1  0  0  0  1 -1.2341141  0.4675928
3:  2  0  0  0  2 -0.6186437  1.5602801
4:  0  1  0  0  2 -0.6186437  1.5602801
5:  1  1  0  0  3  0.1233480 -0.3746259
6:  0  2  0  0  3  0.1233480 -0.3746259

We need as.data.table or as.data.frame because as.list destroys the dimensions of the matrix result and := will just unlist what comes out of t(theta[, d])


If M is really stored as a matrix (unclear since you haven't named its dimensions), I recommend you store it as a data.table (or data.frame) by using M <- data.table(M).

For completeness sake, here's a solution purely in matrix notation:

M <- cbind(M, t(theta[, M[, "d"]]))

Upvotes: 5

Related Questions