Reputation: 239
I have some data that I would like to segment.
My first thought was a classification tree in R, using the rpart package.
My training data consists of many explanatory variables and one 0/1 response variable named "sold". The response value "1" appears in about 80% of the rows.
When I try to build a tree with rpart(sold ~ ., training_data, method = "class"), R is unable to create a tree. I suppose the reason is that it cannot find any segments that differ much from one another. After a quick inspection of the data, I expect the tree to have a left node with about 85% sold and a right node with about 75% sold.
Is there any way to create a classification tree on such a data set?
Upvotes: 2
Views: 1858
Reputation: 1624
I had the same problem. It seems to be related to 'cp', the complexity parameter. See my code:
tr1 <- rpart(bad ~ group + amount, data = ra,
             control = rpart.control(minsplit = 5, cp = 0.001), method = 'class')
When I ran this, it worked. When I increased cp (e.g. to 0.005), it did not.
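To illustrate why cp matters for data like yours, here is a minimal sketch on simulated data (not your data). The data frame training_data, the single predictor group, and the 85%/75% rates are assumptions made up for the example. Both segments still have "1" as the majority class, so the split does not reduce the misclassification error at all, and rpart with the default cp = 0.01 should discard it. A negative cp is one way to keep such splits.

```r
library(rpart)

set.seed(1)
n <- 2000

# Simulated data (assumption): 'group' shifts the sale probability
# from 0.75 to 0.85, so "1" is the majority class in both segments.
training_data <- data.frame(
  group = factor(sample(c("a", "b"), n, replace = TRUE))
)
p_sold <- ifelse(training_data$group == "a", 0.85, 0.75)
training_data$sold <- factor(rbinom(n, 1, p_sold))

# Default control (cp = 0.01): the split gives no gain in
# misclassification error, so it is not kept.
fit_default <- rpart(sold ~ ., data = training_data, method = "class")

# Negative cp: splits are kept even when they do not improve
# the misclassification error.
fit_deep <- rpart(sold ~ ., data = training_data, method = "class",
                  control = rpart.control(cp = -1))

# Compare the number of nodes in each tree
nrow(fit_default$frame)
nrow(fit_deep$frame)

# Estimated probability of each class per segment
unique(predict(fit_deep, type = "prob"))
```

Since both leaves predict the same class, a tree like this is only useful for segmentation if you read the class probabilities per leaf (predict with type = "prob") rather than the predicted class.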
Upvotes: 1