KatyB

Reputation: 3980

cor.test into data.frame in R

Consider the following example:

require(MuMIn)
data(Cement)
d <- data.frame(Cement)

idx <- seq(11,13)
cor1 <- list()
for (i in seq_along(idx)) {
  d2 <- d[1:idx[i],]
  cor1[[i]] <- cor.test(d2$X1,d2$X2, method = "pearson")
}
out <- lapply(cor1, function(x) c(x$estimate, x$conf.int, x$p.value))

Here I compute the Pearson correlation between X1 and X2 on the first 11, 12, and 13 rows of the data, running one cor.test per loop iteration.

I now want to build a single data.frame from the values in the list 'out'. I tried:

df <- do.call(rbind.data.frame, out)

but the result does not seem right:

> df
  c.0.129614123011664..0.195326511912326..0.228579470307565.
1                                                  0.1296141
2                                                  0.1953265
3                                                  0.2285795
  c..0.509907346173941...0.426370467476045...0.368861726657293.
1                                                    -0.5099073
2                                                    -0.4263705
3                                                    -0.3688617
  c.0.676861607564929..0.691690831088494..0.692365536706126.
1                                                  0.6768616
2                                                  0.6916908
3                                                  0.6923655
  c.0.704071702633775..0.542941653020805..0.452566184329491.
1                                                  0.7040717
2                                                  0.5429417
3                                                  0.4525662

This is not what I am after.

How can I generate a five-column data.frame in which the first column identifies which list element the cor.test came from (1 to 3 in this case), the second column holds $estimate, the third and fourth hold the two $conf.int bounds, and the fifth holds $p.value?

Upvotes: 1

Views: 1184

Answers (2)

Rich Scriven

Reputation: 99321

Is this what you're trying to do? Your question is a bit hard to understand. Is a column of indices from the list really necessary? The whole first column will be exactly the same as the row names (which appear on the left-hand side).

> D <- data.frame(cbind(index = seq(length(out)), do.call(rbind, out)))
> names(D)[2:ncol(D)] <- c('estimate', paste0('conf.int', 1:2), 'p.value')
> D
  index  estimate  conf.int1 conf.int2   p.value
1     1 0.1296141 -0.5099073 0.6768616 0.7040717
2     2 0.1953265 -0.4263705 0.6916908 0.5429417
3     3 0.2285795 -0.3688617 0.6923655 0.4525662
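
A variation on the above, as a rough sketch: if each test is reduced to a named numeric vector up front, the names carry through rbind() and no renaming is needed afterwards. The data frame below is a randomly generated stand-in (an assumption, so the block runs without MuMIn), and the column names conf.low and conf.high are illustrative choices, not anything cor.test produces.

```r
# Stand-in data (an assumption): two numeric columns named like the question's X1 and X2.
set.seed(1)
d <- data.frame(X1 = rnorm(13), X2 = rnorm(13))
idx <- 11:13

# Build one named numeric vector per subset; the names become column names.
rows <- lapply(seq_along(idx), function(i) {
  d2 <- d[seq_len(idx[i]), ]
  ct <- cor.test(d2$X1, d2$X2, method = "pearson")
  c(index     = i,
    estimate  = unname(ct$estimate),  # drop the "cor" name
    conf.low  = ct$conf.int[1],       # illustrative column name
    conf.high = ct$conf.int[2],       # illustrative column name
    p.value   = ct$p.value)
})

# rbind() on named vectors gives a matrix with those column names.
res <- as.data.frame(do.call(rbind, rows))
res
```

With the Cement data from the question, replacing the stand-in d with data(Cement) should give the same numbers as the table above, only with different column names.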

Upvotes: 2

lebatsnok

Reputation: 6449

It's not entirely clear what you're asking: you already have such a data frame, just without sensible column names. You can simplify your code to:

ctests <- lapply(idx, function(x) cor.test(d[1:x,"X1"], d[1:x, "X2"]))
ctests <- lapply(ctests, "[", c("estimate", "conf.int", "p.value"))
as.data.frame(do.call(rbind, lapply(ctests, unlist)))
#   estimate.cor  conf.int1 conf.int2   p.value
# 1    0.1296141 -0.5099073 0.6768616 0.7040717
# 2    0.1953265 -0.4263705 0.6916908 0.5429417
# 3    0.2285795 -0.3688617 0.6923655 0.4525662

Is this what you need?
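
If the index column from the question is wanted as well, the same approach can be extended by one step. This is only a sketch: the data frame is a randomly generated stand-in (an assumption, so the block runs without MuMIn), and the index is simply the position in idx.

```r
# Stand-in data (an assumption), shaped like the question's X1 and X2 columns.
set.seed(42)
d <- data.frame(X1 = rnorm(13), X2 = rnorm(13))
idx <- 11:13

ctests <- lapply(idx, function(n) cor.test(d[1:n, "X1"], d[1:n, "X2"]))

# Keep the three components of interest and flatten each test to a named vector.
flat <- lapply(ctests, function(ct) unlist(ct[c("estimate", "conf.int", "p.value")]))

# Bind the rows into a matrix and prepend the position in idx as the first column.
res <- data.frame(index = seq_along(idx), do.call(rbind, flat))
res
```

The column names (estimate.cor, conf.int1, conf.int2, p.value) come from unlist(), which is why they match the output shown above.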

Upvotes: 1
