Hed

Reputation: 45

R: converting a long list of questionnaire choices to a data frame with one row per questionnaire

A questionnaire was given to teachers to check their curriculum preferences. Each teacher had to choose 20 items from about 50 options. The resulting data is a long list of choices in the following form: Teacher ID, Question ID.

I want to reshape it into a table with one row for each teacher and one column for each question, with the possible values 0 (not chosen) and 1 (chosen). In pseudocode it would probably be something like this:

iterate list {
    data [teacher_id] [question_id] = 1
}

Here is some sample data and the intended result:

a <- data.frame(
    Case_ID = c(1,1,2,2,4,4),
    Q_ID    = c(3,5,5,8,2,6)
)   

The intended result is:

res <- data.frame(
    Case_ID = c(1,2,4),
    Q_1    = c(0,0,0),
    Q_2    = c(0,0,1),
    Q_3    = c(1,0,0),
    Q_4    = c(0,0,0),
    Q_5    = c(1,1,0),
    Q_6    = c(0,0,1),
    Q_7    = c(0,0,0),
    Q_8    = c(0,1,0)
)

Any help would be greatly appreciated.

Thanks, Hed

Upvotes: 3

Views: 104

Answers (2)

Ricardo Saporta

Reputation: 55360

Note that you can think of a as a list of indices that reference which cells in a "master array" should be set. If you have a master matrix, say res2, of all 0's, you can then tell R: "all of the elements that are referenced in a should be 1". This is done below.
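As a minimal illustration of that indexing idea, separate from the teacher data: when a matrix is indexed by a two-column matrix, each row of the index is read as a (row, column) pair. The names m_demo and idx_demo are made up for this sketch.

```r
# a small 2 x 3 matrix of 0's
m_demo <- matrix(0, nrow = 2, ncol = 3)

# each row of the index is one (row, column) pair
idx_demo <- cbind(c(1, 2),   # row numbers
                  c(3, 1))   # column numbers

# sets m_demo[1, 3] and m_demo[2, 1] to 1, leaves everything else at 0
m_demo[idx_demo] <- 1
m_demo
```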

First we create the "master matrix"

# identify the unique teacher ID's
teacherIDs <- unique(a$Case_ID)

# count how many teachers there are
numbTeachers <- length(teacherIDs)

# create the column names for the questions
colNames <- paste0("Q_", 1:50)

# dim names for matrix.  Using T_id for the row names
dnames <- list(paste0("T_", teacherIDs), 
              colNames)
# create the matrix
res2 <- matrix(0, ncol=50, nrow=numbTeachers, dimnames=dnames)

Next we convert a to a set of indices.
Note that the Case_ID conversion below is only needed if some teacher IDs are missing, i.e. the IDs are not consecutive. In your example, T_3 is not present.

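# create index out of a
indx <- a
# map each teacher ID to its row position in res2
# (match() follows the order of teacherIDs, so unsorted data is handled too)
indx$Case_ID <- match(indx$Case_ID, teacherIDs)
indx <- as.matrix(indx)

# set the cells referenced in a to 1
res2[indx] <- 1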

res2
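If you need a data frame shaped like the intended result rather than a matrix, one possible follow-up is sketched below. It rebuilds the sample data so that it runs on its own, and it assumes 8 question columns purely to match the example (the real survey has about 50). The names n_q and res_df are made up for illustration.

```r
# sample data from the question
a <- data.frame(
    Case_ID = c(1,1,2,2,4,4),
    Q_ID    = c(3,5,5,8,2,6)
)

teacherIDs <- unique(a$Case_ID)
n_q <- 8   # assumed number of questions; adjust to the real survey

# master matrix of 0's, one row per teacher, one column per question
res2 <- matrix(0, nrow = length(teacherIDs), ncol = n_q,
               dimnames = list(NULL, paste0("Q_", seq_len(n_q))))

# (row, column) pairs: row = position of the teacher, column = question ID
res2[cbind(match(a$Case_ID, teacherIDs), a$Q_ID)] <- 1

# attach the teacher IDs as an ordinary column
res_df <- data.frame(Case_ID = teacherIDs, res2)
res_df
```

Keeping Case_ID as a column, instead of encoding it in the row names, makes later merges with other teacher-level data simpler.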

Upvotes: 0

Matthew Lundberg

Reputation: 42659

Returning a matrix and using matrix indexing to do the work:

m <- matrix(0, nrow=3, ncol=8)
rownames(m) <- c(1,2,4)
colnames(m) <- 1:8
idx <- apply(a, 2, as.character)
m[idx] <- 1

m
##   1 2 3 4 5 6 7 8
## 1 0 0 1 0 1 0 0 0
## 2 0 0 0 0 1 0 0 1
## 4 0 1 0 0 0 1 0 0
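The dimensions above are hard-coded for the sample. A possible generalisation, under the assumption that question IDs run from 1 up to the highest ID that appears in the data, derives them from a instead. The names ids and qs are made up for this sketch; if the highest-numbered question was never chosen, you would need to supply the number of questions yourself.

```r
# sample data from the question
a <- data.frame(
    Case_ID = c(1,1,2,2,4,4),
    Q_ID    = c(3,5,5,8,2,6)
)

ids <- sort(unique(a$Case_ID))   # teachers that appear in the data
qs  <- seq_len(max(a$Q_ID))      # assumed question range: 1 .. highest ID seen

m <- matrix(0, nrow = length(ids), ncol = length(qs),
            dimnames = list(as.character(ids), as.character(qs)))

# a character index matrix is matched against the row and column names
idx <- cbind(as.character(a$Case_ID), as.character(a$Q_ID))
m[idx] <- 1
m
```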

Upvotes: 2
