Reputation: 115
I have a longer dataframe with student name, subjects, question names and their marks. A short version of this dataframe looks like the below:
c1 c2 c3 c4
A 1 a 1
A 1 b 0.5
A 1 c 1
A 2 a 2
A 2 b 1.5
A 2 c 3
A 2 d 3
B 1 a 0
B 1 b 1.5
B 1 c 2
B 2 a 2
B 2 b 2.5
B 2 c 4
B 2 d 5
Here A, B are students, 1,2 are subjects, a, b,c are question names and the last column is marks obtained by each student, in each subject and in each question.
I would like to consolidate part of dataframe with question name column becoming heading for several columns with corresponding marks like below:
Student Sub:1 Sub:2
Name a b c a b c d
A 1 0.5 1 2 1.5 3 3
B 0 1.5 2 2 2.5 4 5
I can use transpose function (t(dataframe)
) to switch column to rows but I don't know how to do partly retaining details of students and marks.
Can some one guide how can I achieve this?
Upvotes: 2
Views: 134
Reputation: 101663
Here is base R solution using reshape()
df$namecol <- with(df,paste0(c2,c3))
dfout <- reshape(df[-c(2:3)],direction = "wide",idvar = "c1",timevar = "namecol")
names(dfout) <- gsub(".*\\.(.*)","\\1",names(dfout))
such that
> dfout
c1 1a 1b 1c 2a 2b 2c 2d
1 A 1 0.5 1 2 1.5 3 3
8 B 0 1.5 2 2 2.5 4 5
Upvotes: 0
Reputation: 389012
We can use pivot_wider
, supplying column names from c2
and c3
columns and value from c4
.
tidyr::pivot_wider(df,names_from = c(c2, c3),values_from = c4, names_prefix = "Sub")
# c1 Sub1_a Sub1_b Sub1_c Sub2_a Sub2_b Sub2_c Sub2_d
# <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 A 1 0.5 1 2 1.5 3 3
#2 B 0 1.5 2 2 2.5 4 5
Or with data.table
, dcast
data.table::dcast(df, c1~paste0("Sub",c2)+c3, value.var = "c4")
data
df <- structure(list(c1 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A", "B"), class = "factor"),
c2 = c(1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L,
2L), c3 = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 4L, 1L, 2L,
3L, 1L, 2L, 3L, 4L), .Label = c("a", "b", "c", "d"), class = "factor"),
c4 = c(1, 0.5, 1, 2, 1.5, 3, 3, 0, 1.5, 2, 2, 2.5, 4, 5)), class = "data.frame",
row.names = c(NA, -14L))
Upvotes: 2