Change data from wide to long format in R

Question

I have data where students are rated by two raters each on multiple questions. Each row contains these variables:

the student ID,
the ID of the first rater for the first item,
the rating assigned by the first rater for the first item,
the ID of the second rater for the first item,
the rating assigned by the second rater for the first item,

...and then the same four columns repeat for each additional item.

It looks something like this:

Student_ID  <- c(1:4)
Item1_first_rater_id <- c(1,2,1,2)
Item1_first_rating <- c(2,3,4,2)
Item1_second_rater_id <- c(2,3,2,3)
Item1_second_rating <- c(4,5,3,2)
Item2_first_rater_id <- c(4,2,5,1)
Item2_first_rating <- c(2,3,4,2)
Item2_second_rater_id <- c(6,7,2,3)
Item2_second_rating <- c(3,4,5,4)

wide <- data.frame(Student_ID, Item1_first_rater_id, Item1_first_rating, 
                          Item1_second_rater_id, Item1_second_rating, 
                          Item2_first_rater_id, Item2_first_rating, 
                          Item2_second_rater_id, Item2_second_rating)

I need the data to be in a long format like this:

Student_ID  <- c(1:4)
Item_number <- c(1,1,2,2)
Rater_id <- c(1:4)
Score <- c(2,3,4,5)
long <- data.frame(Student_ID, Item_number, Rater_id, Score)

Any ideas about how to reshape?

Thanks.

A5C1D2H2I1M1N2O1R2T1 · Accepted Answer

It isn't totally clear what you're trying to do (in other words, how exactly you want to transform your source data). Here is one guess that might at least get you closer to your desired output.

It seems like the names in your "wide" dataset contain three sets of information: (1) an item number, (2) a "time" (first or second), and (3) another variable (either "rating" or "rater id").
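As a quick check of that reading, here is a minimal base R sketch that pulls the three pieces out of two of the column names. The object name `parts` and the regular expressions are only illustrative assumptions, not something taken from your data.

```r
# Two example column names taken from the question's `wide` data frame
nm <- c("Item1_first_rater_id", "Item2_second_rating")

# `parts` is a hypothetical name used only for this illustration
parts <- data.frame(
  Item = as.integer(sub("^Item([0-9]+)_.*$", "\\1", nm)),  # item number
  Time = sub("^Item[0-9]+_([^_]+)_.*$", "\\1", nm),        # "first" or "second"
  Var  = sub("^Item[0-9]+_[^_]+_(.*)$", "\\1", nm),        # "rater_id" or "rating"
  stringsAsFactors = FALSE
)
parts
```

If the three columns come out as expected, the same naming pattern can drive the reshaping.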

We can use `melt`, `colsplit`, and `dcast` from the reshape2 package to do the reshaping.

Step 1: `melt` the dataset

library(reshape2)
orignames <- names(wide) # Store the original names so we can replace them
names(wide) <- gsub("Item([0-9])_(.*)_(rater_id|rating)", 
                    "\\1.\\2.\\3", names(wide))
# "melt" the dataset
m.wide <- melt(wide, id.vars="Student_ID")
head(m.wide)
#   Student_ID         variable value
# 1          1 1.first.rater_id     1
# 2          2 1.first.rater_id     2
# 3          3 1.first.rater_id     1
# 4          4 1.first.rater_id     2
# 5          1   1.first.rating     2
# 6          2   1.first.rating     3

Step 2: Create the new columns using `colsplit`

m.wide <- cbind(m.wide, 
                colsplit(m.wide$variable, "\\.", 
                         c("Item", "Time", "Var")))
head(m.wide)
#   Student_ID         variable value Item  Time      Var
# 1          1 1.first.rater_id     1    1 first rater_id
# 2          2 1.first.rater_id     2    1 first rater_id
# 3          3 1.first.rater_id     1    1 first rater_id
# 4          4 1.first.rater_id     2    1 first rater_id
# 5          1   1.first.rating     2    1 first   rating
# 6          2   1.first.rating     3    1 first   rating

Step 3: Use `dcast` to reshape the data

dcast(m.wide, Student_ID + Item ~ Time + Var, value.var="value")
#   Student_ID Item first_rater_id first_rating second_rater_id second_rating
# 1          1    1              1            2               2             4
# 2          1    2              4            2               6             3
# 3          2    1              2            3               3             5
# 4          2    2              2            3               7             4
# 5          3    1              1            4               2             3
# 6          3    2              5            4               2             5
# 7          4    1              2            2               3             2
# 8          4    2              1            2               3             4

The variables to the left of the ~ identify the rows, and the variables to the right are spread out into columns, so moving a variable from one side to the other changes the "shape" of your data.
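For example, here is a sketch that moves Time to the left of the ~, which I am guessing is closer to the long format you described: one row per student, item, and rater, with rater_id and rating as the only value columns. It rebuilds the data from your question so that it runs on its own; the name `long` for the result is my own choice.

```r
library(reshape2)

# Rebuild the example data from the question
wide <- data.frame(
  Student_ID = 1:4,
  Item1_first_rater_id  = c(1, 2, 1, 2),
  Item1_first_rating    = c(2, 3, 4, 2),
  Item1_second_rater_id = c(2, 3, 2, 3),
  Item1_second_rating   = c(4, 5, 3, 2),
  Item2_first_rater_id  = c(4, 2, 5, 1),
  Item2_first_rating    = c(2, 3, 4, 2),
  Item2_second_rater_id = c(6, 7, 2, 3),
  Item2_second_rating   = c(3, 4, 5, 4)
)

# Same renaming, melting, and splitting as in Steps 1 and 2
names(wide) <- gsub("Item([0-9])_(.*)_(rater_id|rating)",
                    "\\1.\\2.\\3", names(wide))
m.wide <- melt(wide, id.vars = "Student_ID")
m.wide <- cbind(m.wide,
                colsplit(m.wide$variable, "\\.",
                         c("Item", "Time", "Var")))

# Time now identifies rows instead of being spread into columns
long <- dcast(m.wide, Student_ID + Item + Time ~ Var, value.var = "value")
head(long)
```

With the four students and two items above, this should give 16 rows (4 students x 2 items x 2 raters). You can drop the Time column afterwards if you don't need to know which rater was first.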
