Reputation: 9
Troubles with R subsetting and arranging datasets
I have a dataset that looks like this:
Student Skill Correct
64525 10 1
64525 10 1
70363 10 0
70363 10 1
70363 10 1
64525 15 0
70363 15 0
70363 15 1
I need to create a new dataset for each skill, with one row per student and one column per observation (Correct). Like this:
Skill: 10
Student Obs1 Obs2 Obs3
64525 1 1 NA
70363 0 1 1
Skill: 15
Student Obs1 Obs2
64525 0 NA
70363 0 1
Notice that the number of columns in each skill dataset can vary, depending on the number of observations for each student. Notice also that a value can be NA if there is no such observation in the dataset (a student can try the skill a different number of times than other students).
I think this might be a job for the dplyr package, but I am not sure.
I really appreciate the help of the community!!
Upvotes: 0
Views: 129
Reputation: 92282
Here's a possible data.table implementation
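library(data.table) # V 1.10.0
# df is the data set from the question; value.var is named explicitly so dcast() does not have to guess it
res <- setDT(df)[, .(.(dcast(.SD, Student ~ rowid(Student), value.var = "Correct"))), by = Skill]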
Which will result in a data.table of data.tables
res
# Skill V1
# 1: 10 <data.table>
# 2: 15 <data.table>
Which could be subset by the Skill column
res[Skill == 10, V1]
# [[1]]
# Student 1 2 3
# 1: 64525 1 1 NA
# 2: 70363 0 1 1
Or in order to see the whole column
res[, V1]
# [[1]]
# Student 1 2 3
# 1: 64525 1 1 NA
# 2: 70363 0 1 1
#
# [[2]]
# Student 1 2
# 1: 64525 0 NA
# 2: 70363 0 1
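If you want the Obs1, Obs2, ... column names from the question and a plain named list instead of a nested data.table, something along these lines might work. It is only a sketch: the helper column `Obs` and the list name `by_skill` are names I made up, and it assumes a data.table version that has `rowid()` and the `by` argument of `split()`.

```r
library(data.table)

# Sample data copied from the question
df <- data.table(
  Student = c(64525, 64525, 70363, 70363, 70363, 64525, 70363, 70363),
  Skill   = c(10, 10, 10, 10, 10, 15, 15, 15),
  Correct = c(1, 1, 0, 1, 1, 0, 0, 1)
)

# Number each attempt within a Student/Skill pair (Obs is a made-up helper column)
df[, Obs := rowid(Student, Skill)]

# One wide table per skill, stored in a list named by skill
by_skill <- lapply(split(df, by = "Skill"), function(d) {
  wide <- dcast(d, Student ~ Obs, value.var = "Correct")
  # rename the attempt columns from "1", "2", ... to "Obs1", "Obs2", ...
  old <- setdiff(names(wide), "Student")
  setnames(wide, old, paste0("Obs", old))
  wide
})

by_skill[["10"]]
```

Each element can then be pulled out by skill, e.g. `by_skill[["15"]]`. Students with fewer attempts than the maximum for that skill get NA in the trailing columns.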
Upvotes: 1
Reputation: 70603
This will get the job done.
xy <- read.table(text = "Student Skill Correct
64525 10 1
64525 10 1
70363 10 0
70363 10 1
70363 10 1
64525 15 0
70363 15 0
70363 15 1", header = TRUE)
# first split by skill and work on each element
# (lapply() always returns a list; sapply() could simplify to a matrix when all lengths match)
lapply(split(xy, xy$Skill), FUN = function(x) {
  # extract column Correct for each student
  out <- lapply(split(x, x$Student), FUN = "[[", "Correct")
  # pad shortest vectors with NAs at the end
  out <- mapply(out, max(lengths(out)), FUN = function(m, a) {
    c(m, rep(NA, times = (a - length(m))))
  }, SIMPLIFY = FALSE)
  do.call(rbind, out)
})
$`10`
[,1] [,2] [,3]
64525 1 1 NA
70363 0 1 1
$`15`
[,1] [,2]
64525 0 NA
70363 0 1
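If you would rather have data frames with a Student column and Obs1, Obs2, ... headers, as in the question, the same split-and-pad idea could be wrapped up roughly like this. Treat it as a sketch: `to_wide` and `res` are names I invented, and the padding uses `length<-`, which extends a vector with NA.

```r
xy <- read.table(text = "Student Skill Correct
64525 10 1
64525 10 1
70363 10 0
70363 10 1
70363 10 1
64525 15 0
70363 15 0
70363 15 1", header = TRUE)

# hypothetical helper: one wide data frame for a single skill
to_wide <- function(x) {
  # list of Correct vectors, one per student
  attempts <- split(x$Correct, x$Student)
  n <- max(lengths(attempts))
  # `length<-` pads each vector with NA up to length n
  m <- do.call(rbind, lapply(attempts, `length<-`, n))
  colnames(m) <- paste0("Obs", seq_len(n))
  data.frame(Student = as.integer(rownames(m)), m, row.names = NULL)
}

# lapply() keeps the result a list named by skill
res <- lapply(split(xy, xy$Skill), to_wide)
res[["15"]]
```

`res[["10"]]` and `res[["15"]]` can then be used like any other data frame.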
Upvotes: 0