Reputation: 4575
I have a data frame of the form below with two fields. field1 is a categorical field with only two values, TRUE or FALSE, and field2 is a list column of character vectors. I would like to parse field2 and create one new column for each unique value found in those vectors. For the data below that means three new columns: Bas, ants, and onal. Each new column should hold a TRUE/FALSE flag indicating whether that row's field2 contains the value the column is named for. For example, row 1 of Bas would be TRUE and row 1 of onal would be FALSE. In Python, pandas has a function called get_dummies that does something similar. I'm not sure whether there is an equivalent in R, and I'm also not sure how to parse the list. Any tips are greatly appreciated.
Sample data:
structure(list(field1 = c("False", "TRUE"), field2 = list(
c("Bas", "ants"), c("Bas", "onal"))), .Names = c("field1",
"field2"), row.names = c(1904L, 1968L), class = "data.frame")
Here is a sample of what I'd like the output to look like:
structure(list(field1 = c(FALSE, TRUE), field2 = list(
c("Bas", "ants"), c("Bas", "onal")), Bas = c(TRUE, TRUE
), ants = c(TRUE, FALSE), onal = c(FALSE, TRUE)), .Names = c("field1",
"field2", "Bas", "ants", "onal"), class = "data.frame", row.names = c(NA,
-2L))
Upvotes: 0
Views: 80
Reputation: 388
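Try this. Note that it assumes the list has first been split into separate columns (field2 and field3):
library(reshape2)
data <- structure(list(field1 = c("False", "TRUE"), field2 = c("Bas", "Bas"),
field3 = c("ants", "onal")), .Names = c("field1", "field2", "field3"),
row.names = c(1904L, 1968L), class = "data.frame")
# Melt to long format, keeping field2 and field3 as id columns
newdata <- melt(data, id.vars = c("field3", "field2"))
# Cast the field1 values against each id column in turn
x <- acast(newdata, value ~ field3)
y <- acast(newdata, value ~ field2)
# Bind the two wide matrices side by side
final <- cbind(x, y)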
Upvotes: 0
Reputation: 887213
We can use mtabulate from the qdapTools package:
library(qdapTools)
!!(mtabulate(df1$field2))
#      ants  Bas  onal
#[1,]  TRUE TRUE FALSE
#[2,] FALSE TRUE  TRUE
Data:
df1 <- structure(list(field1 = c("False", "TRUE"),
field2 = list(c("Bas", "ants"), c("Bas", "onal"))),
.Names = c("field1", "field2"),
row.names = c(1904L, 1968L), class = "data.frame")
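If installing qdapTools is not an option, the same flags can be sketched in base R. This is only a minimal sketch under a few assumptions: field2 is a list of character vectors as in the question, the names levs and result are made up for illustration, and the new columns come out in order of first appearance (Bas, ants, onal) rather than alphabetically as mtabulate orders them.

```r
# Sample data from the question
df1 <- structure(list(field1 = c("False", "TRUE"),
                      field2 = list(c("Bas", "ants"), c("Bas", "onal"))),
                 .Names = c("field1", "field2"),
                 row.names = c(1904L, 1968L), class = "data.frame")

# Unique values across all list elements, in order of first appearance
# (levs is a hypothetical name used only in this sketch)
levs <- unique(unlist(df1$field2))

# Add one logical column per unique value:
# TRUE when that row's field2 vector contains the value
result <- df1
for (v in levs) {
  result[[v]] <- vapply(df1$field2, function(x) v %in% x, logical(1))
}

# Optionally turn field1 into a real logical column, as in the desired output;
# as.logical() accepts both "False" and "TRUE"
result$field1 <- as.logical(result$field1)

print(result)
```

The loop keeps the original list column in place and appends the flag columns to it, so the result has the same column layout as the desired output in the question.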
Upvotes: 1