Reputation: 11
I have data in the following format:
Reg_No Subject
AA11 Physics
AA11 Chemistry
AA12 English
AA12 Maths
AA12 Physics
I am trying to transform this data so that each student's subjects appear on a single row:
Physics Chemistry
English Maths Physics
I know that each student can take a maximum of 8 subjects.
I am trying to create a matrix that stores the above data with one row per student (each student has a different number of subjects).
I have written the following code:
# read csv file
Term4 <- read.csv("Term4.csv")
# Find number of Students
Matrix_length <- length(unique(Term4$Reg_No))
# Uniquely store their reg number
Student <- unique(Term4$Reg_No)
# create matrix to be inserted as csv
out <- matrix(NA, nrow=Matrix_length , ncol=8) # max subjects = 8 so ncol =8
# iterate to get each reg number's subjects
for (n in 1:Matrix_length) {
y <- Term4[Term4[,"Reg_No"] == Student[n],]$Subject
# transpose Courses as a single column into row and insert it in the matrix
out[n,] <- t(y)
}
I am getting the following error:
Error in out[n, ] <- t(y) :
number of items to replace is not a multiple of replacement length
Could anyone please tell me how to fix this error?
Thanks and Regards
Upvotes: 0
Views: 355
Reputation: 35324
reshape() can do this:
df <- data.frame(Reg_No=c('AA11','AA11','AA12','AA12','AA12'), Subject=c('Physics','Chemistry','English','Maths','Physics'))
# number each student's subjects 1, 2, 3, ... and use that as the reshape time variable
reshape(transform(df, time=ave(seq_along(Reg_No), Reg_No, FUN=seq_along)), direction='wide', idvar='Reg_No')
##   Reg_No Subject.1 Subject.2 Subject.3
## 1   AA11   Physics Chemistry      <NA>
## 3   AA12   English     Maths   Physics
This will generate a data.frame with as many columns as are necessary to cover all subjects.
Your code fails because you preallocated the matrix with 8 columns, but the RHS of each assignment contains only as many subjects as the current student n has in the original data.frame. R rejects index-assignments whose target length is not a multiple of the RHS length (for plain vectors this is only a warning, but for matrices it is an error; either way, it is almost never the right thing to do).
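To see the warning-versus-error difference described above, here is a small sketch. The variable names (v, m, vec_result, mat_result) are made up for illustration, and tryCatch() is used only to record which kind of condition each assignment raises.

```r
# Plain vector: a length mismatch in an index-assignment only warns.
v <- rep(NA_character_, 8)
vec_result <- tryCatch(
  { v[1:8] <- letters[1:3]; "no condition" },
  warning = function(w) "warning",
  error   = function(e) "error"
)

# Matrix row: the same mismatch is an error.
m <- matrix(NA_character_, 2, 8)
mat_result <- tryCatch(
  { m[1, ] <- letters[1:3]; "no condition" },
  warning = function(w) "warning",
  error   = function(e) "error"
)

print(c(vector = vec_result, matrix = mat_result))
```

Note that an RHS whose length happens to divide 8 (1, 2, 4 or 8 subjects) would be recycled silently, so the original loop fails only for some students.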
In general, if you ever do need to carry out such a non-divisible assignment, you can do it by extending the RHS to sufficient length by appending NAs. This could be done with rep() and c(), but there's actually an elegant and easy way to do it using out-of-bounds indexing. Here's a demo:
m <- matrix(NA_character_,2,8)
m
##      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] NA   NA   NA   NA   NA   NA   NA   NA
## [2,] NA   NA   NA   NA   NA   NA   NA   NA
m[1,] <- letters[1:3] ## fails; indivisible
## Error in m[1, ] <- letters[1:3] :
##   number of items to replace is not a multiple of replacement length
m[2,] <- letters[1:3][1:ncol(m)] ## works
m
##      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] NA   NA   NA   NA   NA   NA   NA   NA
## [2,] "a"  "b"  "c"  NA   NA   NA   NA   NA
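Applied to the loop from the question, the same out-of-bounds indexing might look like the sketch below. It is only one possible fix, under a few assumptions: the data.frame is built inline from the sample rows instead of being read from Term4.csv, Subject is converted with as.character() in case it was read in as a factor, and the row names are added purely for readability.

```r
# Sample data from the question; stands in for read.csv("Term4.csv").
Term4 <- data.frame(
  Reg_No  = c("AA11", "AA11", "AA12", "AA12", "AA12"),
  Subject = c("Physics", "Chemistry", "English", "Maths", "Physics"),
  stringsAsFactors = FALSE
)

Student <- unique(Term4$Reg_No)
out <- matrix(NA_character_, nrow = length(Student), ncol = 8)  # max subjects = 8

for (n in seq_along(Student)) {
  y <- as.character(Term4$Subject[Term4$Reg_No == Student[n]])
  # Indexing past the end of y yields NA, so the RHS always has ncol(out) items.
  out[n, ] <- y[seq_len(ncol(out))]
}

rownames(out) <- Student
print(out[, 1:3])
```

One caveat: if a student ever had more than 8 subjects, this indexing would silently drop the extras, so a check on length(y) may be worth adding.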
Upvotes: 2