Nick Corona

Reputation: 45

Why is there no explicit ordering in handwritten digit classification?

From chapter 2 of The Elements of Statistical Learning:

[Image: excerpt from chapter 2]

Obviously 0, 1, 2, 3, ..., 9 can be ordered. What am I misunderstanding? Is it because the ordering of these digits doesn't aid in classification?

Upvotes: 0

Views: 71

Answers (1)

desertnaut

Reputation: 60319

The key word here is handwritten.

When we classify images of handwritten digits (MNIST), the arithmetic values of the digits (and, as a consequence, their ordering) are not part of the classification problem. Here, class (i.e. digit) "9" is neither "greater" nor "less" than class "8", and the distance between class "9" and class "8" is the same as the distance between "9" and "3" (in fact, it is the same for all pairs of classes). In other words, the digits are treated simply as values of a categorical variable.
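
To make the "same distance between all pairs of classes" point concrete, here is a minimal sketch using one-hot targets, which is one common way to encode class labels (the book excerpt may use a different encoding). The helper names `one_hot` and `class_distance` are made up for this illustration.

```python
import math

NUM_CLASSES = 10  # digits 0 through 9


def one_hot(label, num_classes=NUM_CLASSES):
    """Return the one-hot target vector for an integer class label."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def class_distance(a, b):
    """Euclidean distance between the one-hot targets of classes a and b."""
    return math.dist(one_hot(a), one_hot(b))


d_98 = class_distance(9, 8)
d_93 = class_distance(9, 3)

# The two distances are equal: under this encoding "8" is no closer
# to "9" than "3" is.
print(d_98, d_93)
```

Any two distinct classes differ in exactly two coordinates of their one-hot vectors, so every pairwise distance comes out as the square root of 2, regardless of the numeric values of the digits.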

Put differently, the classification methodology here is identical to what we would use to classify, say, handwritten letters, which of course do not have any ordering in the arithmetic sense (no letter is "greater" or "less" than any other).
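
As a rough illustration of why digits and letters are interchangeable here, the sketch below renames each digit class to an arbitrary letter and checks that the accuracy does not change. The labels and predictions are toy data invented for this example, and `accuracy` and `to_letter` are hypothetical helpers, not part of any library.

```python
import string

# Toy ground-truth labels and predictions (made up for illustration).
y_true = [9, 8, 3, 9]
y_pred = [9, 3, 3, 8]

# Arbitrary renaming of the ten classes: 0 -> 'a', 1 -> 'b', ..., 9 -> 'j'.
to_letter = dict(zip(range(10), string.ascii_lowercase))


def accuracy(true_labels, predicted_labels):
    """Fraction of positions where the predicted label equals the true one."""
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


acc_digits = accuracy(y_true, y_pred)
acc_letters = accuracy(
    [to_letter[t] for t in y_true],
    [to_letter[p] for p in y_pred],
)

# Both values are the same: only label equality matters, not label values.
print(acc_digits, acc_letters)
```

The evaluation only ever asks whether two labels are equal, so any one-to-one renaming of the classes leaves the result untouched.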

Another possibly useful analogy is the difference between the number 9 and the character '9'; in handwritten digit classification we are dealing with the latter, not with the number. And characters/strings, like letters, do not come with any arithmetic ordering.

The case is the same, for example, with the iris dataset, or in problems where we are trying to predict gender (male/female).

There are classification problems where the labels, although categorical, are also ordinal (i.e. ordered), e.g. something like high/medium/low; but classifying the MNIST digits does not fall into this category. It is all about pattern recognition and discrimination of the digit images, without any use of their actual values or ordering.
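
The contrast between ordinal and purely categorical labels can be sketched as follows. The integer codes for low/medium/high and the helper names are assumptions chosen for this illustration; the 0/1 loss is the usual "right or wrong" error count for classification.

```python
# Ordinal labels: integer codes preserve the order low < medium < high,
# so the size of a mistake is meaningful.
ORDINAL_CODES = {"low": 0, "medium": 1, "high": 2}


def ordinal_gap(a, b):
    """How many steps apart two ordinal labels are."""
    return abs(ORDINAL_CODES[a] - ORDINAL_CODES[b])


def zero_one_loss(true_label, predicted_label):
    """0 for a correct prediction, 1 for any wrong one."""
    return 0 if true_label == predicted_label else 1


# Ordinal case: confusing "low" with "high" is a bigger error than
# confusing "low" with "medium".
gap_low_high = ordinal_gap("low", "high")
gap_low_medium = ordinal_gap("low", "medium")

# Digit case: predicting 8 for a true 9 is exactly as wrong as predicting 3.
loss_9_as_8 = zero_one_loss(9, 8)
loss_9_as_3 = zero_one_loss(9, 3)

print(gap_low_high, gap_low_medium, loss_9_as_8, loss_9_as_3)
```

In the ordinal setting the errors are graded, while for the digit images every misclassification counts the same, no matter how numerically "close" the predicted digit is to the true one.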

Upvotes: 1
