Reputation: 945
I am a beginner in R. I have two matrices with the same number of rows (let's say 10) and many columns. I want to run a linear regression, using glm, between each row of matrixA and the corresponding row of matrixB, and store the residuals in a new matrix with the same number of rows as the original matrices:
matrixA <- read.table("fileA.txt", header=TRUE, row.names=1)
matrixB <- read.table("fileB.txt", header=TRUE, row.names=1)
for(i in 1:10) {
response = as.matrix(matrixA)[i,]
predictor = as.matrix(matrixB)[i,]
fit <- glm(response ~ predictor)
residuals[i,] <- fit$residuals
}
However, I am getting this error:
Error in residuals[1, ] <- fit$residuals :
incorrect number of subscripts on matrix
I looked up this error a bit, and thought that maybe it did not recognize fit$residuals as a vector, so I tried to specify it (as.vector(fit$residuals)), but that did not fix it.
Any idea on how I can fix this? Thank you!
Format of the matrices (both have the same format):
a b c d f
A 1.0 2.0 3.0 4.0 5.0
B …
C
D
E
F
G
H
I
J
Upvotes: 1
Views: 12092
Reputation: 132576
You need to preallocate your output matrix before the loop. However, it's easier and cleaner to use mapply. If you pass it two vectors (including lists), it iterates over both simultaneously and applies the function to each pair of elements. Thus we only need to split the matrices into lists of rows.
A <- matrix(1:9, 3)
B <- A * 3 + 2 + 1/A
t(mapply(function(a, b) {
fit <- lm(b ~ a)
residuals(fit)
}, split(A, letters[1:3]), split(B, letters[1:3])))
#           1           2          3
#a 0.10714286 -0.21428571 0.10714286
#b 0.03750000 -0.07500000 0.03750000
#c 0.01851852 -0.03703704 0.01851852
residuals(lm(B[1,] ~ A[1,]))
#        1          2          3
#0.1071429 -0.2142857  0.1071429
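The letters[1:3] grouping above is tied to a three-row matrix. As a sketch for matrices with any number of rows, one option is to split on row(A), which labels every element with its row index. The helper name row_residuals is hypothetical and not part of the code above:

```r
A <- matrix(1:9, 3)
B <- A * 3 + 2 + 1/A

# Hypothetical helper: residuals of a per-row regression of B on A
row_residuals <- function(A, B) {
  stopifnot(identical(dim(A), dim(B)))
  # row(A) gives each element its row index, so split() yields one vector per row
  res <- mapply(function(a, b) residuals(lm(b ~ a)),
                split(A, row(A)), split(B, row(B)))
  t(res)  # mapply returns one column per row of A, so transpose
}

res <- row_residuals(A, B)
```

Each row of res should then match residuals(lm(B[i,] ~ A[i,])) for the corresponding i.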
Here is a for loop that does the same:
result <- matrix(NA, nrow = nrow(A), ncol = ncol(A))
for (i in seq_len(nrow(A))) {
result[i,] <- residuals(lm(B[i,] ~ A[i,]))
}
result
#           [,1]        [,2]       [,3]
#[1,] 0.10714286 -0.21428571 0.10714286
#[2,] 0.03750000 -0.07500000 0.03750000
#[3,] 0.01851852 -0.03703704 0.01851852
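Applied to the setup in the question, a minimal sketch might look like the following. It assumes matrixA and matrixB have the same dimensions; small random data frames stand in for the files here, and the name resid_mat is an illustrative choice that avoids masking the residuals() function:

```r
set.seed(1)
# Stand-in data: 10 rows, 5 columns (the real data would come from read.table)
matrixA <- as.data.frame(matrix(rnorm(50), nrow = 10))
matrixB <- as.data.frame(matrix(rnorm(50), nrow = 10))

# Convert once, outside the loop
A <- as.matrix(matrixA)
B <- as.matrix(matrixB)

# Preallocate the output so that resid_mat[i, ] is a valid assignment target
resid_mat <- matrix(NA_real_, nrow = nrow(A), ncol = ncol(A),
                    dimnames = dimnames(A))

for (i in seq_len(nrow(A))) {
  response  <- A[i, ]
  predictor <- B[i, ]
  fit <- glm(response ~ predictor)  # gaussian family by default
  resid_mat[i, ] <- residuals(fit, type = "response")  # observed minus fitted
}
```

If a row contains missing values, the residual vector will be shorter than the row and the assignment will fail; passing na.action = na.exclude to glm is one way to have residuals() pad those positions with NA.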
Upvotes: 3