Reputation: 165
It's typed using LaTex; this is the original text.
From my understanding, VC dimension is kinda largest integer d such that there exists a sample of size d that can be shattered by the hypothesis set H, but in this case, how could we calculate this if H is rectangles?
Upvotes: 2
Views: 2113
Reputation: 19179
For an axis-aligned rectangle in Rn, the VC dimension is (at least) 2*n.
Consider first a rectangle in R1, which is just a pair of min/max values (a1, b1) along the x1 axis (not actually a rectangle). This pair of values has VC dimension of 2 because for any two points in R1, you can set (a1, b1) such that one, both, or neither of the two points is between them. But when you add a third point along that axis, there is no way you can include the two outer points in the range (a1, b1) without also including the middle point.
To make it easier to visualize, suppose the two points lie at x1 = -2
and x1 = 2
, respectively. You can shatter the set by defining your rectangle bounds (a1, b1) as
(-3, 3) -> both points included
(-3, 1) -> first point only
(-1, 3) -> second point only
(-1, 1) -> neither point included
Now suppose you add an additional dimension so your space is in R2 (i.e., now it is a true rectangular region). Add two more points and let the x1 coordinates of the new points be 0 so they will always be included along the first dimension (for the rectangles defined above) and set the x2 coordinates of the new points to -2 and 2, respectively. Also, set the x2 coordinates of the first two points to 0.
When you set (a1, b1, a2, b2) to (-3, 3, -3, 3) you will include all 4 points. And you can obviously exclude any point by simply be reducing the magnitude of one of the 4 bounds from 3 to 1. Since you can include any subset of the 4 points by changing the R2 rectangle's corresponding bound, the VC dimension of the R2 rectangle is at least 4.
It should be fairly obvious that you can repeat this procedure for an arbitrary number of dimensions. Each time you add a new dimension xi, set the xi coordinate of all points to 0 except for the two new points you add for that dimension (they will be at xi = -2
and xi = 2
, respectively).
Since you can shatter 2 additional points for every dimension, then for an axis-aligned rectangle in Rn, the VC dimension will be at least 2*n.
Upvotes: 1