Reputation: 179
It just came to mind that, for example, suppose we have training data with N
points in 2 dimensions. We know that we can always, naively, build a decision tree so that we can classify each data point (we are probably overfitting, and the depth can go up to 2N).
However, we know that if the data set is linearly separable, then the decision tree can probably take advantage of that. Using the above data set as an example, can we determine an upper bound on the depth for linearly and non-linearly separable data sets? Is it guaranteed that the upper bound on the depth for the linear case is smaller than for the nonlinear case?
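For concreteness, here is a minimal sketch of the naive tree I have in mind (assuming scikit-learn; the random 2-D points and labels are just placeholders): with no depth limit, the tree memorizes any training set whose feature vectors are distinct.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
N = 50
X = rng.random((N, 2))           # N points in 2 dimensions
y = rng.integers(0, 2, size=N)   # arbitrary binary labels

tree = DecisionTreeClassifier()  # no max_depth: grow until every leaf is pure
tree.fit(X, y)

print("training accuracy:", tree.score(X, y))   # 1.0, since all points are distinct
print("depth actually used:", tree.get_depth())
```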
Upvotes: 1
Views: 819
Reputation: 9410
A little too late, but still: you can look at this example, in which a dataset that is not linearly separable requires fewer splits than a linearly separable one.
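As one possible construction of that kind (my own sketch with scikit-learn, not necessarily the linked example): a dataset that is not linearly separable can be handled with only two axis-aligned splits, while a linearly separable "diagonal" dataset forces a much deeper staircase of splits.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# (1) NOT linearly separable: class 0 sits on both sides of class 1 along x
# (a 0-1-0 pattern); the second feature is constant. Two splits on x suffice.
X1 = np.array([[0.0, 0.0], [0.2, 0.0], [1.0, 0.0],
               [1.2, 0.0], [2.0, 0.0], [2.2, 0.0]])
y1 = np.array([0, 0, 1, 1, 0, 0])

# (2) Linearly separable, but only by the diagonal y = x: class 0 lies just
# below the diagonal, class 1 just above it. Axis-aligned splits have to
# build a staircase, so the tree ends up much deeper.
n = 10
t = np.arange(n, dtype=float)
X2 = np.vstack([np.column_stack([t, t - 0.5]),   # below the diagonal -> class 0
                np.column_stack([t, t + 0.5])])  # above the diagonal -> class 1
y2 = np.array([0] * n + [1] * n)

for name, X, y in [("not separable", X1, y1), ("separable (diagonal)", X2, y2)]:
    print(name, "-> tree depth:", DecisionTreeClassifier().fit(X, y).get_depth())
```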
Upvotes: 3
Reputation: 6544
suppose we have training data with N points in 2 dimensions. We know that we can always, naively, build a decision tree so that we can classify each data point.
This is not true if there are 2 points with the same features but different labels.
Decision trees make axis-aligned splits, so linear separability doesn't necessarily reduce the number of splits you need in a tree to separate the classes.
Is it guaranteed that the upper bound on the depth for the linear case is smaller than for the nonlinear case?
No. An easy counterexample is to construct a linearly separable data set with 2*N points and N features, where each data point has exactly one non-zero feature value: for class A that value is negative, and for class B it is positive.
This dataset requires splitting on every feature (and thus growing out to maximal depth) to learn, despite being linearly separable.
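A minimal sketch of this counterexample (assuming scikit-learn; N = 5 is an arbitrary choice):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

N = 5
X = np.vstack([-np.eye(N),    # class A: exactly one feature set to -1 per point
               +np.eye(N)])   # class B: exactly one feature set to +1 per point
y = np.array([0] * N + [1] * N)

# Linearly separable: the sign of the feature sum already gives the class.
print("separable by sum(x) > 0:", np.array_equal(X.sum(axis=1) > 0, y == 1))

# Yet every axis-aligned split can only peel off a single point, so the
# tree is forced into a chain of 2*N - 1 splits (the maximal possible depth).
tree = DecisionTreeClassifier().fit(X, y)
print("tree depth:", tree.get_depth(), " maximal depth for 2*N points:", 2 * N - 1)
```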
Upvotes: 1