Reputation: 502
I am aware of the spread function in the tidyr package, but I have been unable to get the result I need with it.
I have a data.frame with 2 columns, as defined below. I need to turn the Subject column into binary columns of 1s and 0s, with one column per subject and one row per student.
Below is the data frame:
studentInfo <- data.frame(StudentID = c(1,1,1,2,3,3),
                          Subject = c("Maths", "Science", "English", "Maths", "History", "History"))
> studentInfo
  StudentID Subject
1         1   Maths
2         1 Science
3         1 English
4         2   Maths
5         3 History
6         3 History
And the output I am expecting is:
  StudentID Maths Science English History
1         1     1       1       1       0
2         2     1       0       0       0
3         3     0       0       0       1
How can I do this with the spread() function or any other function?
Upvotes: 14
Views: 5784
Reputation: 47310
Using tidyr:
library(tidyr)
studentInfo <- data.frame(
  StudentID = c(1,1,1,2,3,3),
  Subject = c("Maths", "Science", "English", "Maths", "History", "History"))
# values_fn turns every StudentID/Subject cell that has data into 1;
# values_fill puts 0 in the cells that have no data
pivot_wider(studentInfo,
            names_from = "Subject",
            values_from = "Subject",
            values_fill = 0,
            values_fn = function(x) 1)
#> # A tibble: 3 x 5
#>   StudentID Maths Science English History
#>       <dbl> <int>   <int>   <int>   <int>
#> 1         1     1       1       1       0
#> 2         2     1       0       0       0
#> 3         3     0       0       0       1
Created on 2019-09-19 by the reprex package (v0.3.0)
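A variation on the same idea, as a minimal sketch: drop the duplicate rows first, add an indicator column, then pivot that column. The column name present and the result name wide are made up for illustration, and passing a plain scalar to values_fill assumes tidyr 1.1.0 or later.

```r
library(dplyr)
library(tidyr)

studentInfo <- data.frame(
  StudentID = c(1, 1, 1, 2, 3, 3),
  Subject = c("Maths", "Science", "English", "Maths", "History", "History"))

wide <- studentInfo %>%
  distinct() %>%                      # student 3 lists History twice; keep one row
  mutate(present = 1L) %>%            # 'present' is a hypothetical indicator column
  pivot_wider(names_from = Subject,
              values_from = present,
              values_fill = 0L)       # student/subject pairs with no row become 0

wide
```

Because the duplicates are removed up front, no values_fn is needed here, and the new columns should appear in order of first appearance (Maths, Science, English, History).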
Upvotes: 8
Reputation: 887098
We can use table from base R:
+(table(studentInfo)!=0)
#          Subject
# StudentID English History Maths Science
#         1       1       0     1       1
#         2       0       0     1       0
#         3       0       1     0       0
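Note that table returns a contingency table rather than a data frame with a StudentID column. One possible way to get the shape shown in the question, sketched with base R only (the names counts, binary and wide are illustrative, not part of the answer above):

```r
studentInfo <- data.frame(
  StudentID = c(1, 1, 1, 2, 3, 3),
  Subject = c("Maths", "Science", "English", "Maths", "History", "History"))

counts <- table(studentInfo)   # rows = StudentID, columns = Subject, cells = counts
binary <- +(counts != 0)       # any non-zero count becomes TRUE; unary + turns it into 1

# Convert the 2-D table to a wide data frame, then move the row names
# into a proper StudentID column
wide <- as.data.frame.matrix(binary)
wide <- cbind(StudentID = as.numeric(rownames(wide)), wide)
rownames(wide) <- NULL

wide
```

The subject columns come out in alphabetical order here; if the order from the question matters, they can be reordered with wide[, c("StudentID", "Maths", "Science", "English", "History")].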
Upvotes: 7
Reputation: 26258
Using reshape2, we can dcast from long to wide.
As you only want a binary outcome, we can apply unique to the data first, so that repeated rows are not counted twice:
library(reshape2)
si <- unique(studentInfo)
dcast(si, formula = StudentID ~ Subject, fun.aggregate = length)
#  StudentID English History Maths Science
#1         1       1       0     1       1
#2         2       0       0     1       0
#3         3       0       1     0       0
Another approach, using tidyr and dplyr:
library(tidyr)
library(dplyr)
studentInfo %>%
  mutate(yesno = 1) %>%
  distinct() %>%
  spread(Subject, yesno, fill = 0)
#  StudentID English History Maths Science
#1         1       1       0     1       1
#2         2       0       0     1       0
#3         3       0       1     0       0
Although I'm not a fan (yet) of tidyr syntax...
Upvotes: 16