Reputation: 1709
How do I convert df into df2, where df is given by:
> df
ID VALUES
1 1 a,b,c,d
2 2 a
3 3 c,d,f,g
and df2 should look something like:
> df2
ID a b c d f g
1 1 1 1 1 1 0 0
2 2 1 0 0 0 0 0
3 3 0 0 1 1 1 1
where the values from df have been broken out into separate columns and 1s and 0s reflect whether or not the ID was associated with that value (from df).
Is there a specific function for this? I thought this is what table() did but if that's the case I can't figure it out.
Upvotes: 2
Views: 40
Reputation: 263362
Here's a method that uses no extra packages:
0 + t( sapply(df[['VALUES']], function(x) {
letters[1:6] %in% scan(text=x, what="", sep=",") }))
Read 4 items
Read 1 item
Read 4 items
[,1] [,2] [,3] [,4] [,5] [,6]
a,b,c,d 1 1 1 1 0 0
a 1 0 0 0 0 0
c,d,f,g 0 0 1 1 0 1
It does return a matrix and it does depend on the VALUES column being character rather than factor. If you want to suppress the information messages from scan
there is a parmeter for that. You could cbind
this with the ID column:
cbind( df["ID"], 0+ t( sapply(df[['VALUES']], function(x) {letters[1:6] %in% scan(text=x, what="", sep="," , quiet=TRUE) })) )
ID 1 2 3 4 5 6
a,b,c,d 1 1 1 1 1 0 0
a 2 1 0 0 0 0 0
c,d,f,g 3 0 0 1 1 0 1
Upvotes: 3