Reputation: 4883
I'm not really sure about the best way to achieve what I want to do.
Here is my problem: I'm using multiple imputation in R with the mice package, and I'm running cox.zph on each imputed model to get the Schoenfeld residual test (rho, chisq and p for each variable). I would like to average these results over all the imputed models so that I end up with a single value per variable. Honestly, I have no idea how to achieve this. I've looked at lapply and mapply.
res.scho <- NULL
for (i in 1:m) {
res.scho[[i]] <- cox.zph(modf$analyses[[i]])
}
res.scho[[1]]
rho chisq p
Alc=1 0.0622 0.552 4.58e-01
HIV=1 0.1050 1.873 1.71e-01
Diabetes=1 -0.1240 2.227 1.36e-01
age -0.1388 2.877 8.99e-02
GLOBAL NA 44.467 5.97e-08
Basically, I have m imputed datasets, so res.scho has m elements (res.scho[[1]] to res.scho[[m]]), and I would like to combine them into one.
I'm still stuck in the for-loop habits I picked up in other languages, and I'm having trouble using mapply, which could be part of the problem. I would be really grateful if someone could give me some pointers to achieve this and make better use of R.
Thank you!
EDIT
Expected output:
Let's say I have two imputed datasets (m = 2):
res.scho[[1]]
rho chisq p
Alc=1 0.0622 0.552 4.58e-01
HIV=1 0.1050 1.873 1.71e-01
Diabetes=1 -0.1240 2.227 1.36e-01
age -0.1388 2.877 8.99e-02
GLOBAL NA 44.467 5.97e-08
res.scho[[2]]
rho chisq p
Alc=1 0.0522 0.752 5.58e-01
HIV=1 0.1550 1.473 2.71e-01
Diabetes=1 -0.1140 2.927 4.36e-01
age -0.1188 2.077 3.99e-02
GLOBAL NA 44.400 7.97e-08
My desired output would have the same form as each element of the list, but with every value averaged over the two, for instance:
Average_res.scho
rho chisq p
Alc=1 0.0572 0.652 5.08e-01
HIV=1 0.1300 1.673 2.21e-01
Diabetes=1 -0.1190 2.577 2.86e-01
age -0.1288 2.477 6.49e-02
GLOBAL NA 44.433 6.97e-08
For instance, the rho column is obtained as (rho column of res.scho[[1]] + rho column of res.scho[[2]]) / 2.
EDIT1
Following konvas's suggestions, I'm trying to use his ideas to get the desired output. Here is what I have so far:
rho <- NULL
chisq <- NULL
p <- NULL
for (i in 1:70) {
rho[[i]] <- res.scho[[i]]$table[,"rho"]
chisq[[i]] <- res.scho[[i]]$table[,"chisq"]
p[[i]] <- res.scho[[i]]$table[,"p"]
}
This gives me one list per column of res.scho, each holding one named vector per imputed model. It's not the perfect solution. If I print rho[[1]], I see the first model's rho column:
[[1]]
CliForm=1 Sit=2 Alc=1 HIV=1 Diabetes=1 age GLOBAL
0.17300198 -0.45800541 0.06224951 0.10495093 -0.12401631 -0.13879592 NA
Now I'm thinking of doing this for rho, chisq and p:
for (i in 1:70) {
result <- sapply(names(rho[[1]]),
function(x) colMeans(sapply(rho, "[[", x)))
}
And I get the following error that I've been trying to solve:
Error in colMeans(sapply(rho, "[[", x)) : 'x' must be an array of at least two dimensions
Upvotes: 0
Views: 217
Reputation: 14346
First, extract the relevant matrix of coefficients from each entry of the list res.scho:
res.scho.tables <- lapply(res.scho, `[[`, "table")
Then, since all you want is an average (and matrices can be added up elementwise quite fast), you can fold `+` over the list with Reduce and divide by the number of tables:
Average_res.scho <- Reduce(`+`, res.scho.tables) / length(res.scho.tables)
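To illustrate the averaging step without fitting any models, here is a minimal sketch that rebuilds the two example tables from the question as plain matrices and averages them. The names rn, cn, t1, t2, tables and avg are made up for this example; in the real workflow the list would come from the `table` component of each cox.zph result. Reduce applies `+` pairwise along the list, so the same line should work for any number of tables, not just two.

```r
# Row and column names copied from the question's printed output.
rn <- c("Alc=1", "HIV=1", "Diabetes=1", "age", "GLOBAL")
cn <- c("rho", "chisq", "p")

# matrix() fills column by column: rho first, then chisq, then p.
# These are stand-ins for res.scho[[1]]$table and res.scho[[2]]$table.
t1 <- matrix(c(0.0622, 0.1050, -0.1240, -0.1388, NA,
               0.552, 1.873, 2.227, 2.877, 44.467,
               4.58e-01, 1.71e-01, 1.36e-01, 8.99e-02, 5.97e-08),
             ncol = 3, dimnames = list(rn, cn))
t2 <- matrix(c(0.0522, 0.1550, -0.1140, -0.1188, NA,
               0.752, 1.473, 2.927, 2.077, 44.400,
               5.58e-01, 2.71e-01, 4.36e-01, 3.99e-02, 7.97e-08),
             ncol = 3, dimnames = list(rn, cn))

tables <- list(t1, t2)

# Elementwise sum of all matrices, divided by how many there are.
# NA + NA stays NA, so the GLOBAL rho entry remains NA.
avg <- Reduce(`+`, tables) / length(tables)

print(signif(avg, 3))
```

The result keeps the row and column names of the inputs, so it has the same layout as each individual table. This assumes every table has the same dimensions and the same row order.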
Upvotes: 1