Sasak

Reputation: 199

From multiple values to categorical values

I have a data frame like this:

   structure(list(price = structure(1:4, .Label = c("price1", "price2", 
"price3", "price4"), class = "factor"), col1 = structure(c(1L, 
2L, NA, 3L), .Label = c("text1", "text2", "text3"), class = "factor"), 
    col2 = structure(c(NA, 1L, NA, NA), .Label = "text1", class = "factor"), 
    col3 = structure(c(NA, 1L, NA, NA), .Label = "text3", class = "factor"), 
    col4 = structure(c(NA, 1L, NA, NA), .Label = "text4", class = "factor")), .Names = c("price", 
"col1", "col2", "col3", "col4"), class = "data.frame", row.names = c(NA, 
-4L))

How can I turn the distinct values found in the rows into column names, with 1 or 0 indicating whether each value is present in that row?

Example output:

price  text1 text2 text3 text4
price1 1     0     0     0
price2 1     1     1     1
price3 0     0     0     0
price4 0     0     1     0

Upvotes: 0

Views: 34

Answers (1)

akrun

Reputation: 887118

We create a logical matrix for every column except the first using is.na, negate it so that TRUE marks non-missing values, coerce it to binary with the unary +, and assign the result back to those columns of the data:

df1[-1] <- +(!is.na(df1[-1]))
df1
#   price col1 col2
#1 price1    1    0
#2 price2    1    1
#3 price3    0    0

Another option is lapply, which does the same conversion column by column:

df1[-1] <- lapply(df1[-1], function(x) as.integer(!is.na(x)))
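
To make the two idioms above reproducible, here is a minimal sketch. The original df1 is not shown in this thread, so the data frame below is a hypothetical input chosen to match the printed output; the names res_matrix and res_lapply are also just for illustration.

```r
# Hypothetical input: the original df1 is not shown, so this is an assumption.
df1 <- data.frame(
  price = c("price1", "price2", "price3"),
  col1  = c("text1", "text2", NA),
  col2  = c(NA, "text1", NA),
  stringsAsFactors = TRUE
)

# is.na() on a data frame returns a logical matrix;
# ! flips it so TRUE means "value present", and unary + coerces TRUE/FALSE to 1/0.
res_matrix <- df1
res_matrix[-1] <- +(!is.na(df1[-1]))

# Same result, built one column at a time.
res_lapply <- df1
res_lapply[-1] <- lapply(df1[-1], function(x) as.integer(!is.na(x)))

print(res_matrix)
print(res_lapply)
```

Both versions only flag whether a cell is filled; they keep the original column names (col1, col2) rather than spreading the values into columns.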

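For the new dataset in the question (called df2 here), reshape to long format and then cast back to wide, counting how often each value occurs per price: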

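library(data.table)
# melt to long format (dropping NAs), then count each value per price;
# drop = FALSE keeps price levels that have no values (price3) as all-zero rows
dcast(melt(setDT(df2), id.vars = 'price', na.rm = TRUE),
      price ~ value, fun.aggregate = length, drop = FALSE)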
#    price text1 text2 text3 text4
#1: price1     1     0     0     0
#2: price2     1     1     1     1
#3: price3     0     0     0     0
#4: price4     0     0     1     0
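
If you would rather avoid the data.table dependency, the same idea can be sketched in base R with table(). This is a swapped-in alternative, not the approach above: df2 is rebuilt by hand from the dput in the question, and the names long, counts, presence and result are illustrative only.

```r
# df2 rebuilt from the question's dput (price kept as a factor so that
# price3, which has no values, still gets a row).
df2 <- data.frame(
  price = factor(c("price1", "price2", "price3", "price4")),
  col1  = c("text1", "text2", NA, "text3"),
  col2  = c(NA, "text1", NA, NA),
  col3  = c(NA, "text3", NA, NA),
  col4  = c(NA, "text4", NA, NA),
  stringsAsFactors = FALSE
)

# Stack all value columns into one vector (column by column) and repeat
# price once per value column so the two vectors line up.
long <- data.frame(
  price = rep(df2$price, times = ncol(df2) - 1),
  value = unlist(df2[-1], use.names = FALSE)
)

# table() ignores NA by default and keeps every factor level of price.
counts <- table(long$price, long$value)

# Turn counts into 0/1 presence flags, in case a value repeats within a row.
presence <- as.data.frame.matrix(+(counts > 0))
result <- data.frame(price = rownames(presence), presence, row.names = NULL)

print(result)
```

For this input the result should have the same rows and columns as the expected output in the question.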

Upvotes: 1
