rpart function is overplotting or the desired partition is not achieved

Question

  ID Ethnicity MaritalStatus EmploymentStatus type
1 10         5             3                1    3
2 24         1             2                2    1
3 30         1             1                3    4
4 35         2             2                2    3
5 40         5             1                3    4
6 57         1             2                4    1

This is a sample of my data; the full table has almost 94,000 rows. I tried the following command:

m1 <- rpart(type ~ Ethnicity, MaritalStatus, EmploymentStatus, 
      data = table2, method = "anova")

My objective is to predict 'type' from Ethnicity, MaritalStatus and EmploymentStatus. All the variables were converted to factors using as.factor(), but the partition is made on ID, whereas I want it to be made on Ethnicity, then MaritalStatus and EmploymentStatus. I tried removing the ID column from the data frame, but the problem persists.
I have attached an image of the results I get and the corresponding rpart.plot output.
Is my data type, or my basic approach to the data, wrong?
I am a beginner in machine learning. I also tried changing the data type of ID to numeric.
Is there any way to set a hierarchy for the partition?
Why is the graph just a line?
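
A possible cause, offered as a sketch rather than a confirmed diagnosis: the formula in the command above separates the predictors with commas instead of `+`. In `rpart(formula, data, weights, subset, ...)`, anything after the formula that is not named is matched by position, so `MaritalStatus` and `EmploymentStatus` are most likely being taken as `weights` and `subset` rather than as predictors. In addition, `method = "anova"` fits a regression tree, while a factor response such as `type` normally calls for `method = "class"`. The data frame below is a hypothetical stand-in for `table2`, generated only so that the example runs; its values and the rule used to build `type` are assumptions, not the original data.

```r
library(rpart)

# Hypothetical stand-in for the asker's table2 (values are made up).
set.seed(1)
n <- 500
table2 <- data.frame(
  ID               = seq_len(n),
  Ethnicity        = factor(sample(1:5, n, replace = TRUE)),
  MaritalStatus    = factor(sample(1:3, n, replace = TRUE)),
  EmploymentStatus = factor(sample(1:4, n, replace = TRUE))
)

# Assumed rule linking type to the predictors, so the tree has something to find.
table2$type <- factor(ifelse(
  table2$Ethnicity %in% c("1", "2"), 1,
  ifelse(table2$EmploymentStatus %in% c("3", "4"), 4, 3)
))

# Predictors are joined with `+` inside the formula; ID is simply left out.
# method = "class" requests a classification tree for the factor response.
m1 <- rpart(type ~ Ethnicity + MaritalStatus + EmploymentStatus,
            data = table2, method = "class")

# Variables actually used for splitting ("<leaf>" marks terminal nodes).
used <- setdiff(unique(as.character(m1$frame$var)), "<leaf>")
print(used)
```

Only variables named in the formula can be used for splits, so ID does not need to be removed from the data frame. As for the hierarchy, rpart picks the split variable at each node by its own splitting criterion, so the order in which predictors are written in the formula does not fix the order of the splits.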

overplotted rpart plot

Answers (1)