Reputation: 85
I am trying to set parts of a covariance matrix to 0 depending on whether the corresponding variables belong to the same factor. A short example is a 4x4 matrix with two factors of two variables each: x1 and x2 form one factor, x3 and x4 the other.
The following code generates a covariance matrix.
dataframe <- matrix(c(18, 29, 13, 56, 64, 23, 56, 92, 23, 65, 28, 54, 46, 82, 46, 92), 4, 4)
colnames(dataframe) <- (c("x1", "x2", "x3", "x4"))
rownames(dataframe) <- (c("x1", "x2", "x3", "x4"))
cov.d <- cov(dataframe)
round(cov.d, 2)
       x1      x2      x3     x4
x1 368.67  294.67  252.33 414.00
x2 294.67  806.25 -161.50  80.83
x3 252.33 -161.50  409.67 446.33
x4 414.00   80.83  446.33 577.00
Assuming x1 and x2 are part of factor 1, and assuming x3 and x4 are part of factor 2, I would like the output to look like the following:
       x1     x2     x3     x4
x1 368.67 294.67      0      0
x2 294.67 806.25      0      0
x3      0      0 409.67 446.33
x4      0      0 446.33 577.00
I'd imagine the solution would have something to do with a loop and the replace function.
So far I have tried the following, with these parameters set:
num.factors <- 2; vars.per.factor <- 10; num.vars <- 20
for (k in 1:num.factors)
  for (i in 1:vars.per.factor)
    for (j in 1:num.vars) {
      factor.cov <- replace(cov.d, cov.d[i + k * vars.per.factor - vars.per.factor, j + k * vars.per.factor - vars.per.factor], 0)
    }
The issue lies in the replace call, specifically the cov.d[i, i] part. I know [i, i] is not the index I need, but I'm drawing a blank on where to go from here. I'll keep playing around with it and update my progress as I go.
Thank you for your help!
Upvotes: 2
Views: 969
Reputation: 1417
This is how I would do it using only base R. I do not claim that this is the most efficient code: I tend to write extra commands to make it easy to read, and %in% slows things down if the list of factors is very long or if the factors themselves have many elements. It gets the job done for small groups of small factors, though, and they don't even have to be the same size.
facReplace <- function(m, f) {
  # f is a list of factors, f1, f2, ..., fn, each a character vector of names
  # They are combined into a single character vector called x
  # Also make a data-frame copy of the matrix m, labelled with those names
  x <- unlist(f)
  m1 <- data.frame(m)
  row.names(m1) <- x
  names(m1) <- x
  # loop over the factors and use %in% to set cells of m1 whose row and
  # column don't share a factor to 0
  for (i in seq_along(f)) {
    tempfac <- f[[i]]
    for (j in seq_along(x)) {
      for (k in seq_along(x)) {
        temprow <- x[j]
        tempcol <- x[k]
        if (!(temprow %in% tempfac) && (tempcol %in% tempfac)) m1[j, k] <- 0
      }
    }
  }
  return(m1)
}
# Test the function on random data, using the factors from the original example
set.seed(123)
thedata <- matrix(data = runif(16, 0, 10), nrow = 4, ncol = 4)
thedata
         [,1]     [,2]     [,3]     [,4]
[1,] 2.875775 9.404673 5.514350 6.775706
[2,] 7.883051 0.455565 4.566147 5.726334
[3,] 4.089769 5.281055 9.568333 1.029247
[4,] 8.830174 8.924190 4.533342 8.998250
factor1 <- c("x1", "x2")
factor2 <- c("x3", "x4")
theFactors <- list(factor1, factor2)
facReplace(thedata, theFactors)
         x1       x2       x3       x4
x1 2.875775 9.404673 0.000000 0.000000
x2 7.883051 0.455565 0.000000 0.000000
x3 0.000000 0.000000 9.568333 1.029247
x4 0.000000 0.000000 4.533342 8.998250
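For larger inputs the triple loop can be avoided. What follows is a minimal sketch rather than part of the answer above. It assumes the factors are given as a list of character vectors (as in theFactors), that the matrix has matching row and column names, and that each name appears in exactly one factor. The names zero_cross_factor and membership are hypothetical, introduced only for this illustration. It also shows how replace can be used: its second argument is an index (here a logical mask), not a value to look for.

```r
# Hypothetical helper: zero every entry whose row and column variables
# belong to different factors. Assumes m has row and column names and
# that each name appears in exactly one element of f.
zero_cross_factor <- function(m, f) {
  # Map each variable name to the index of the factor that contains it
  membership <- rep(seq_along(f), lengths(f))
  names(membership) <- unlist(f)
  row_fac <- membership[rownames(m)]
  col_fac <- membership[colnames(m)]
  # TRUE where the row variable and column variable share a factor
  same <- outer(row_fac, col_fac, "==")
  # replace() takes an index (a logical mask here), not a value to match
  replace(m, !same, 0)
}

# Covariance matrix built as in the question
dataframe <- matrix(c(18, 29, 13, 56, 64, 23, 56, 92,
                      23, 65, 28, 54, 46, 82, 46, 92), 4, 4)
colnames(dataframe) <- c("x1", "x2", "x3", "x4")
cov.d <- cov(dataframe)

theFactors <- list(c("x1", "x2"), c("x3", "x4"))
result <- zero_cross_factor(cov.d, theFactors)
round(result, 2)
```

Unlike facReplace, this sketch returns a matrix rather than a data frame, and the factors do not need to be the same size or listed in the same order as the columns.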
Upvotes: 0
Reputation: 528
I have a feeling this has probably been asked before, but I can't find any matching questions with a quick search.
Anyway, the following will do what you want:
setzero <- function(x) {
  # assumes a square matrix made up of two equal-sized factors
  n <- nrow(x)
  half <- n / 2
  x[(half + 1):n, 1:half] <- 0   # lower-left block
  x[1:half, (half + 1):n] <- 0   # upper-right block
  return(x)
}
> cov.d <- setzero(cov.d)
> cov.d
         x1       x2       x3       x4
x1 368.6667 294.6667   0.0000   0.0000
x2 294.6667 806.2500   0.0000   0.0000
x3   0.0000   0.0000 409.6667 446.3333
x4   0.0000   0.0000 446.3333 577.0000
This works as a quick generic function for two equal-sized factors, as you indicated you wanted. There are probably more elegant solutions.
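With more than two factors, as in the question's setup with vars.per.factor variables in each factor, the same idea can be extended. The following is only a sketch under the assumption that the factors occupy consecutive, equal-sized blocks of rows and columns. The function name setzero_blocks is hypothetical.

```r
# Hypothetical generalisation: keep only the diagonal blocks of a square
# matrix x, assuming consecutive blocks of vars.per.factor variables each.
setzero_blocks <- function(x, vars.per.factor) {
  stopifnot(nrow(x) == ncol(x), nrow(x) %% vars.per.factor == 0)
  # row(x) and col(x) give each cell's row and column number; integer
  # division turns those into zero-based block labels
  row_block <- (row(x) - 1) %/% vars.per.factor
  col_block <- (col(x) - 1) %/% vars.per.factor
  # Cells whose row block and column block differ lie off the diagonal blocks
  x[row_block != col_block] <- 0
  x
}

# Made-up example: 3 factors of 2 variables each, all entries equal to 1
m6 <- matrix(1, nrow = 6, ncol = 6)
res <- setzero_blocks(m6, vars.per.factor = 2)
res
```

Calling setzero_blocks(cov.d, 2) on the 4x4 covariance matrix from the question should zero the same two blocks as setzero.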
Upvotes: 0
Reputation: 39154
You can use row and column indices to assign 0 to the blocks that fall outside the factors.
m <- matrix(data = c(18, 29, 13, 56, 64, 23, 56, 92,
                     23, 65, 28, 54, 46, 82, 46, 92),
            nrow = 4, byrow = TRUE)
m[3:4, 1:2] <- 0
m[1:2, 3:4] <- 0
m
#     [,1] [,2] [,3] [,4]
#[1,]   18   29    0    0
#[2,]   64   23    0    0
#[3,]    0    0   28   54
#[4,]    0    0   46   92
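The example above zeroes blocks of the raw data matrix. As a sketch of the same assignment applied to the covariance matrix from the question, the blocks can also be indexed by name. This assumes cov.d is built exactly as in the question. The names f1, f2 and factor.cov are chosen for illustration only.

```r
# Covariance matrix built as in the question
dataframe <- matrix(c(18, 29, 13, 56, 64, 23, 56, 92,
                      23, 65, 28, 54, 46, 82, 46, 92), 4, 4)
colnames(dataframe) <- c("x1", "x2", "x3", "x4")
cov.d <- cov(dataframe)

# Variable names per factor, as described in the question
f1 <- c("x1", "x2")
f2 <- c("x3", "x4")

factor.cov <- cov.d
factor.cov[f1, f2] <- 0   # factor-1 rows, factor-2 columns
factor.cov[f2, f1] <- 0   # factor-2 rows, factor-1 columns
round(factor.cov, 2)
```

Indexing by name avoids depending on the position of each variable, so it keeps working if the columns are reordered.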
Upvotes: 1