Reputation: 9
I am trying to model an existing meta-analysis to examine alternative hypotheses (e.g., doing a random-effects analysis), as well as re-sampling techniques. There are over 2,000 subjects, but the data is fairly simple: a binary outcome, success or failure, linked with a score (0-10) on a structured assessment. I have the frequencies of success or failure for each score, nested within each study. I am looking for an easier way to create the dataset than keying it in or calling the rep function many times.
I would like each row to look something like this: Study_ID, Test_Result[0-10], Outcome[0 or 1]
For example, let's say I just had two studies and two test levels (1 or 2): study 1 has 35 successes, and 85 failures for score of "1"; for a score of "2," 46 successes and 83 failures. In study 2, for a score of "1" there are 78 successes, 246 failures; for a score of "2," 45 successes and 96 failures.
Using just the frequencies provided, how could I most easily create a data frame with the several hundred lines of data?
Upvotes: 0
Views: 79
Reputation: 32426
This might work. To add more studies, the only thing you should need to modify is the studies list (with more score levels, the score and type vectors would need extending as well).
## Your specifications
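## Put the counts for each study in a list so it's easy to work with
## (order within a study: score 1 successes, score 1 failures, score 2 successes, score 2 failures)
studies <- list(
  study1 = c(35, 85, 46, 83),
  study2 = c(78, 246, 45, 96))
score <- rep(1:2, each=2)  # 1 1 2 2
type <- rep(0:1, len=4)    # 0 1 0 1 (here 0 = success, 1 = failure, following the order of the counts)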
## Repeat score/type by counts of each grouping/study
res <- lapply(studies, function(study)
  data.frame(
    score=rep(score, study),
    type=rep(type, study)
  ))
## Combine into data.frame
dat <- data.frame(study=rep(seq_along(studies), times=sapply(studies, sum)),
                  as.list(do.call(rbind, res)))
head(dat)
#   study score type
# 1     1     1    0
# 2     1     1    0
# 3     1     1    0
# 4     1     1    0
# 5     1     1    0
# 6     1     1    0
## Check counts
with(dat, table(type, score, study))
# , , study = 1
#
#     score
# type   1   2
#    0  35  46
#    1  85  83
#
# , , study = 2
#
#     score
# type   1   2
#    0  78  45
#    1 246  96
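If the counts are easier to keep as a table (one row per study/score/outcome cell), another way to get the same result is to repeat the row indices of that table by the counts. This is only a sketch: the counts data frame, its column names, and the coding of outcome (1 = success, 0 = failure) are my assumptions, not something from the question.

```r
## Hypothetical long-format table of counts: one row per study/score/outcome cell.
## Assumed coding: outcome 1 = success, 0 = failure.
counts <- data.frame(
  study   = rep(1:2, each = 4),
  score   = rep(rep(1:2, each = 2), times = 2),
  outcome = rep(c(1, 0), times = 4),
  n       = c(35, 85, 46, 83,     # study 1: score 1 (success, failure), score 2 (success, failure)
              78, 246, 45, 96)    # study 2: same order
)

## Repeat each row index n times, keeping only the identifying columns
long <- counts[rep(seq_len(nrow(counts)), times = counts$n),
               c("study", "score", "outcome")]
rownames(long) <- NULL

## Check: one row per subject, and the cell counts match the input
nrow(long) == sum(counts$n)
xtabs(~ outcome + score + study, data = long)
```

With scores 0-10 the counts table would have 22 rows per study, which is still far less to type than the individual observations, and only the counts table changes when a study is added.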
Upvotes: 1