Reputation: 303
I have a data set that records people's qualifications. There are several rows per person, and each row already holds its variables in wide format. I need to "widen" it further so that there is a single row per person, with the set of variables repeated as numbered columns. You guessed it: the data needs to go into a spreadsheet template.
There will be a maximum of 10 rows per person, but no specified minimum.
Here's a simplified example of the data in its current form:
current <- structure(list(id = c("Bob", "Bob", "Bob", "Bob", "Jim", "Jim",
"Jim", "Jim"), awarding.body = c("SQA", "SQA", "SQA", "SQA",
"SQA", "SQA", "SQA", "SQA"), qual.type = c("HIGHER GRADE", "HIGHER GRADE",
"STANDARD GRADE", "STANDARD GRADE", "HIGHER GRADE", "HIGHER GRADE",
"STANDARD GRADE", "STANDARD GRADE"), year.awarded = c(1998L,
1998L, 1996L, 1996L, 1999L, 1999L, 1997L, 1997L), band = c("A",
"A", "B", "B", "B", "B", "A", "B"), subject = c("Mathematics",
"Chemistry", "French", "Physics", "Fine Art", "Geography", "Craft & Design",
"French")), .Names = c("id", "awarding.body", "qual.type", "year.awarded",
"band", "subject"), class = "data.frame", row.names = c(NA, -8L
))
Here is how I need the data to look:
desired <- structure(list(id = c("Bob", "Jim"), awarding.body.1 = c("SQA",
"SQA"), qual.type.1 = c("HIGHER GRADE", "HIGHER GRADE"), year.awarded.1 = 1998:1999,
band.1 = c("A", "B"), subject.1 = c("Mathematics", "Fine Art"
), awarding.body.2 = c("SQA", "SQA"), qual.type.2 = c("HIGHER GRADE",
"HIGHER GRADE"), year.awarded.2 = 1998:1999, band.2 = c("A",
"B"), subject.2 = c("Chemistry", "Geography"), awarding.body.3 = c("SQA",
"SQA"), qual.type.3 = c("STANDARD GRADE", "STANDARD GRADE"
), year.awarded.3 = 1996:1997, band.3 = c("B", "A"), subject.3 = c("French",
"Craft & Design"), awarding.body.4 = c("SQA", "SQA"), qual.type.4 = c("STANDARD GRADE",
"STANDARD GRADE"), year.awarded.4 = 1996:1997, band.4 = c("B",
"B"), subject.4 = c("Physics", "French")), .Names = c("id",
"awarding.body.1", "qual.type.1", "year.awarded.1", "band.1",
"subject.1", "awarding.body.2", "qual.type.2", "year.awarded.2",
"band.2", "subject.2", "awarding.body.3", "qual.type.3", "year.awarded.3",
"band.3", "subject.3", "awarding.body.4", "qual.type.4", "year.awarded.4",
"band.4", "subject.4"), class = "data.frame", row.names = c(NA,
-2L))
I tried various things with the reshape2 package, but I don't think this is a typical reshaping problem. I've looked at various reshaping questions on here but haven't found a solution.
Advice greatly appreciated.
Upvotes: 0
Views: 76
Reputation: 887251
Try adding a within-id row index and using it as the timevar in base R's reshape():
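# Number the rows within each id (1, 2, 3, ...) to serve as the "time" variable
current1 <- transform(current, indx = ave(seq_along(id), id, FUN = seq_along))
# Spread every non-id column into one set of columns per index value
desired1 <- reshape(current1, idvar = 'id', timevar = 'indx', direction = 'wide')
# Reset row names and drop the reshape bookkeeping attribute so the comparison is exact
row.names(desired1) <- NULL
attr(desired1, 'reshapeWide') <- NULL
all.equal(desired1, desired)
#[1] TRUE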
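Since the question says there is no fixed minimum number of rows per person, it may help to see how the same approach behaves when ids have different numbers of rows. The following is only a sketch on made-up data: the `toy` data frame and its contents are hypothetical, not taken from the question. It shows the index that `ave()` produces and that ids with fewer rows get `NA` in the unused columns.

```r
# Hypothetical toy data: Ann has three rows, Ben only one
toy <- data.frame(
  id      = c("Ann", "Ann", "Ann", "Ben"),
  band    = c("A", "B", "C", "A"),
  subject = c("Maths", "French", "Physics", "Art"),
  stringsAsFactors = FALSE
)

# Number the rows within each id: 1, 2, 3 for Ann and 1 for Ben
toy$indx <- ave(seq_along(toy$id), toy$id, FUN = seq_along)

# One row per id; each remaining column is repeated once per index value
wide <- reshape(toy, idvar = "id", timevar = "indx", direction = "wide")
row.names(wide) <- NULL
attr(wide, "reshapeWide") <- NULL

print(toy$indx)
print(names(wide))
print(wide)
```

Ben's row has values only in the `.1` columns; the `.2` and `.3` columns are filled with `NA`, so people with fewer qualifications simply leave the later columns empty.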
Upvotes: 1