Jerry

Reputation: 9

Using frequency data to populate dataframe in R

I am trying to model an existing meta-analysis to examine alternative hypotheses (e.g., by doing a random-effects analysis) and to try re-sampling techniques. There are over 2,000 subjects, but the data is fairly simple: a binary outcome, success or failure, linked with a score (0-10) on a structured assessment. I have the frequencies of success and failure for each score, nested within each study. I am looking for an easier way to create the dataset than keying it in or calling the rep function many times.

I would like each row to look something like this: Study_ID, Test_Result[0-10], Outcome[0 or 1]

For example, let's say I had just two studies and two test scores (1 or 2). In study 1, a score of "1" has 35 successes and 85 failures; a score of "2" has 46 successes and 83 failures. In study 2, a score of "1" has 78 successes and 246 failures; a score of "2" has 45 successes and 96 failures.

Using just the frequencies provided, how could I most easily create a data frame with the several hundred lines of data?

Upvotes: 0

Views: 79

Answers (1)

Rorschach

Reputation: 32426

This might work. The only thing that should need to be modified to add more studies is the studies list.

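## Your specifications
## Put the counts for each study in a list so they're easy to work with.
## Within each study the order is: score 1 successes, score 1 failures,
## score 2 successes, score 2 failures
studies <- list(
    study1 = c(35, 85, 46, 83),
    study2 = c(78, 246, 45, 96))
score <- rep(1:2, each=2)          # 1 1 2 2
type <- rep(0:1, length.out=4)     # 0 1 0 1 (0 = success, 1 = failure here)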

## Repeat score/type by counts of each grouping/study
res <- lapply(studies, function(study)
    data.frame(
        score=rep(score, study),
        type=rep(type, study)
    ))

## Combine into data.frame
dat <- data.frame(study=rep(seq_along(studies), times=sapply(studies, sum)),
                  as.list(do.call(rbind, res)))
head(dat)
#   study score type
# 1     1     1    0
# 2     1     1    0
# 3     1     1    0
# 4     1     1    0
# 5     1     1    0
# 6     1     1    0

## Check counts
with(dat, table(type, score, study))
# , , study = 1
# 
#     score
# type   1   2
#    0  35  46
#    1  85  83
# 
# , , study = 2
# 
#     score
# type   1   2
#    0  78  45
#    1 246  96
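
If the frequencies are easier to type as a table, another option is to put one row per study/score/outcome cell in a data frame and repeat each row by its count. This is only a sketch: the names freq, long, outcome and n are made up for illustration, and coding outcome as 1 = success, 0 = failure is an assumption about what you want.

```r
## Hypothetical frequency table: one row per study/score/outcome cell.
## Assumed coding: outcome 1 = success, 0 = failure.
freq <- data.frame(
    study   = rep(1:2, each = 4),
    score   = rep(rep(1:2, each = 2), times = 2),
    outcome = rep(c(1, 0), times = 4),
    n       = c(35, 85, 46, 83, 78, 246, 45, 96))

## Repeat each row index n times to get one row per subject
long <- freq[rep(seq_len(nrow(freq)), times = freq$n),
             c("study", "score", "outcome")]
rownames(long) <- NULL

## Check counts against the original frequencies
xtabs(~ outcome + score + study, data = long)
```

With the full data you would only need to extend freq (or read it in from a CSV); the expansion step stays the same.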

Upvotes: 1
