Saobi
Saobi

Reputation: 17041

Question About VC Dimension

If I have the input space of (1,2,....999). And I have a concept class C, with 10 concepts: C0,C1,C2...C9.

Given an input, that input is an element of ci if the it contains the digit i. For example, the number 123 is an element of c1 and c2 and c3.

What is the VC Dimension of this concept class C?

Upvotes: 0

Views: 1242

Answers (2)

charles.fox
charles.fox

Reputation: 493

Is this answer really correct?

Shattering means that for your chosen set of data points eg. (14,24,3), for every possible labelling of it, there exists a concept in the set consistent with that labelling.

But consider the example (14,24,3) given, here is a list of all possible true/false labellings for those three points, and which classes are consistent with them:

0 0 0 C_5, C_6, C_7, C_8, C_9, C_0 all consistent with this

0 0 1 C_3 only (because the third number is three, and only class C_3 contains it)

0 1 0 C_2 and C_4 (because "24" contains 2 and 4)

0 1 1 C_2, C_4 and C_3

1 0 0 C_1 and C_4

1 0 1 no consistent classes (because "14" and "3" dont share any digits in common)

1 1 0 C_4 (because "14" and "24" both contain a 4)

1 1 1 no consistent classes

So the class set does not shatter this data set. (Or am I misunderstanding something in the definition?)

Upvotes: 0

Stompchicken
Stompchicken

Reputation: 15961

I don't want to post the whole solution here, but here's something...

Finding the VC dimension involves finding sets of points in the input space that can be shattered by C.

I can easily find a set of three points that can be shattered by C, (14, 24, 3).

It's harder to find a set of four points that can be shattered by C, but (157, 256, 367, 4) works.

It's very, very hard to find five points that can be shattered by C, which strongly suggests that the VC dimension of C (given the input space) is 4. However, the tricky part is proving that it's impossible to find any set of five points that can be shattered.


Actually, there may be some ambiguity in the question. It depends in what sense the concept class can 'correctly classify' a set of points. i.e. Does C1 correctly classify (1, 2) where 1 is given a negative class label and 2 is given a positive one (since it partitions it correctly), or can only C2 do that? I've assumed that it can, because the problem is slightly more interesting that way.

Upvotes: 2

Related Questions