Reputation: 199
I have a data frame like this:
structure(list(price = structure(1:4, .Label = c("price1", "price2",
"price3", "price4"), class = "factor"), col1 = structure(c(1L,
2L, NA, 3L), .Label = c("text1", "text2", "text3"), class = "factor"),
col2 = structure(c(NA, 1L, NA, NA), .Label = "text1", class = "factor"),
col3 = structure(c(NA, 1L, NA, NA), .Label = "text3", class = "factor"),
col4 = structure(c(NA, 1L, NA, NA), .Label = "text4", class = "factor")), .Names = c("price",
"col1", "col2", "col3", "col4"), class = "data.frame", row.names = c(NA,
-4L))
How can I turn the distinct values found in the rows into column names, with 1 or 0 indicating whether each value is present in that row?
Example output:
price text1 text2 text3 text4
price1 1 0 0 0
price2 1 1 1 1
price3 0 0 0 0
price4 0 0 1 0
Upvotes: 0
Views: 34
Reputation: 887118
We create a logical matrix for every column except the first using !is.na, coerce it to binary with the unary +, and assign the result back to those columns:
df1[-1] <- +(!is.na(df1[-1]))
df1
# price col1 col2
#1 price1 1 0
#2 price2 1 1
#3 price3 0 0
Another option is lapply, which converts each column separately:
df1[-1] <- lapply(df1[-1], function(x) as.integer(!is.na(x)))
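To see what each step does, here is a minimal sketch on a small made-up data frame. The name toy and its contents are hypothetical, chosen only for illustration; they are not the data from the question. It keeps the intermediate logical matrix in its own variable and applies both variants to copies of the same input.

```r
# Hypothetical toy data, only for illustration
toy <- data.frame(
  price = c("price1", "price2", "price3"),
  col1  = c("text1", "text2", NA),
  col2  = c(NA, "text1", NA)
)

# Step 1: TRUE where a value is present, FALSE where it is NA
present <- !is.na(toy[-1])

# Step 2: unary + coerces the logical matrix to integer 0/1,
# which is then written back into every column except the first
a <- toy
a[-1] <- +present

# The lapply variant does the same conversion one column at a time
b <- toy
b[-1] <- lapply(toy[-1], function(x) as.integer(!is.na(x)))

a
b
```

Both a and b should hold the same 0/1 indicators; the lapply form may be easier to adapt if different columns need different handling.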
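For the updated dataset in the question (here called df2), melt it to long format, dropping the NAs, and then dcast it back to wide format, counting occurrences with length: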
library(data.table)
dcast(melt(setDT(df2), id.vars = 'price', na.rm = TRUE),
      price ~ value, fun.aggregate = length, drop = FALSE)
# price text1 text2 text3 text4
#1: price1 1 0 0 0
#2: price2 1 1 1 1
#3: price3 0 0 0 0
#4: price4 0 0 1 0
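If data.table is not available, roughly the same table can be built in base R with table. This is only a sketch of an alternative, not the approach used above. It rebuilds the question's data as df2 with character columns, and the names long, counts, and res are hypothetical helpers introduced for illustration.

```r
# The question's data, rebuilt with price as a factor so that
# prices without any value (price3) still get a row
df2 <- data.frame(
  price = factor(c("price1", "price2", "price3", "price4")),
  col1  = c("text1", "text2", NA, "text3"),
  col2  = c(NA, "text1", NA, NA),
  col3  = c(NA, "text3", NA, NA),
  col4  = c(NA, "text4", NA, NA)
)

# Long format: one (price, value) pair per cell, stacked column by column
long <- data.frame(
  price = rep(df2$price, times = ncol(df2) - 1),
  value = unlist(lapply(df2[-1], as.character), use.names = FALSE)
)

# table() ignores NA values by default and keeps all factor levels of price
counts <- unclass(table(long$price, long$value))

# Clamp counts to 0/1 in case a value occurs twice in one row
res <- as.data.frame(+(counts > 0))
res
```

Here the prices end up as row names rather than as a price column; cbind(price = rownames(res), res) would move them into a column if that layout is preferred.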
Upvotes: 1