William Clarke

Reputation: 43

How to convert predicted values into binary variables and save them to a CSV

I have built a decision tree model on test data, then used it to predict values in a test dataset.

dtpredict<-predict(ct1, testdat, type="class")

The output looks like:

      1       2       3       4       5       6 
    Class_2 Class_2 Class_6 Class_2 Class_8 Class_2 

I want to write a CSV that looks like:

id, Class_1, Class_2, Class_3, Class_4, Class_5, Class_6, Class_7, Class_8, Class_9
1, 0, 1, 0, 0, 0, 0, 0, 0, 0
2, 0, 1, 0, 0, 0, 0, 0, 0, 0
3, 0, 0, 0, 0, 0, 1, 0, 0, 0
4, 0, 1, 0, 0, 0, 0, 0, 0, 0
5, 0, 0, 0, 0, 0, 0, 0, 1, 0
6, 0, 1, 0, 0, 0, 0, 0, 0, 0

Upvotes: 3

Views: 293

Answers (2)

Hillary Sanders

Reputation: 6047

What are the 0s and 1s, logicals? If so, they don't make much sense: in your example they are all class 1, which doesn't correspond to your example dtpredict. If they are logicals:

# if dtpredict is a factor vector, where the values are the classes
# and the names are the ids:
ids <- as.numeric(names(dtpredict))
classes <- as.character(dtpredict)
x <- data.frame(id = ids)
# add one 0/1 indicator column per class that occurs in the predictions
# (loop over levels(dtpredict) instead to also get all-zero columns
# for classes that were never predicted)
for (cl in sort(unique(classes))) {
  x[[cl]] <- as.numeric(classes == cl)
}
write.csv(x, 'blah.csv', row.names = FALSE)
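
The same idea can also be written without the loop. This is only a minimal sketch: it assumes dtpredict is a named factor like the one shown in the question (rebuilt by hand here as a stand-in), and the output file name is hypothetical. It compares every prediction against every factor level, so classes that were never predicted still get an all-zero column.

```r
# Stand-in for the predict() output in the question (assumed: a named factor)
dtpredict <- factor(
  c("Class_2", "Class_2", "Class_6", "Class_2", "Class_8", "Class_2"),
  levels = paste("Class", 1:9, sep = "_")
)
names(dtpredict) <- 1:6

# Compare every prediction with every level: a 6 x 9 logical matrix
ind <- outer(as.character(dtpredict), levels(dtpredict), "==")
colnames(ind) <- levels(dtpredict)

# Multiplying by 1L turns TRUE/FALSE into 1/0 and keeps the matrix shape
out <- cbind(id = as.integer(names(dtpredict)), as.data.frame(ind * 1L))

out_file <- file.path(tempdir(), "predictions.csv")  # hypothetical file name
write.csv(out, out_file, row.names = FALSE)
```

Using levels(dtpredict) rather than the observed values is what gives the full Class_1 to Class_9 header that the question asks for.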

Upvotes: 0

Dominic Comtois

Reputation: 10411

There's a package called dummies that does that well...

install.packages("dummies")
library(dummies)

x <- factor(c("Class_2", "Class_2", "Class_6", "Class_2", "Class_8", "Class_2"),
            levels = paste("Class", 1:9, sep="_"))

dummy(x, drop = FALSE)

     xClass_1 xClass_2 xClass_3 xClass_4 xClass_5 xClass_6 xClass_7 xClass_8 xClass_9
[1,]        0        1        0        0        0        0        0        0        0
[2,]        0        1        0        0        0        0        0        0        0
[3,]        0        0        0        0        0        1        0        0        0
[4,]        0        1        0        0        0        0        0        0        0
[5,]        0        0        0        0        0        0        0        1        0
[6,]        0        1        0        0        0        0        0        0        0

All that remains is to get rid of the "x" prefix, which should not be too hard with something like this:

d <- dummy(x,drop = FALSE)
colnames(d) <- sub("^x", "", colnames(d))  # strip only the leading "x"

and then to save to disk:

write.csv(d, "somefile.csv", row.names = FALSE)
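
If installing a package is not an option, base R's model.matrix can build the same indicator matrix. The following is a sketch under assumptions rather than a drop-in replacement: x is the same example factor as above, the id column is simply the row position, and the output file name is hypothetical.

```r
# Same example factor as above, with all nine levels declared
x <- factor(c("Class_2", "Class_2", "Class_6", "Class_2", "Class_8", "Class_2"),
            levels = paste("Class", 1:9, sep = "_"))

# "- 1" drops the intercept so every level gets its own 0/1 column
m <- model.matrix(~ x - 1)
colnames(m) <- levels(x)  # replace the "xClass_1"-style names

# Prepend an id column (assumed here to be the row position)
d2 <- data.frame(id = seq_along(x), m)

out_file <- file.path(tempdir(), "somefile_base.csv")  # hypothetical file name
write.csv(d2, out_file, row.names = FALSE)
```

Because the levels are declared on the factor, classes that never occur in x still appear as all-zero columns.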

Upvotes: 1
