Hanyu Guo
Hanyu Guo

Reputation: 717

Why doesn't the hidden state of a neuron network provide better dimension reduction result than original input?

I just read a great post here. I am curious about content of "An example with images" in that post. If the hidden states mean a lot of features of the original picture and getting closer to final result, using dimension reduction on hidden states should provide better result than the original raw pixels, I think.

Hence, I tried it on mnist digits with 2 hidden layers of 256 unit NN, using T-SNE for dimension reduction; the result is far from ideal. From left to right, top to bot, they are raw pixels, second hidden layer and final prediction. Can anyone explain that?

BTW, the accuracy of this model is around 94.x%. enter image description here

Upvotes: 1

Views: 83

Answers (1)

Marcin Możejko
Marcin Możejko

Reputation: 40516

You have ten classes, and as you mentioned, your model is performing well on this dataset - so in this 256 dimensional space - the classes are separated well using linear subspaces.

So why T-SNE projections don't have this property?

One trivial answer which comes to my mind is that projecting a highly dimensional set to two dimensions may lose the linear separation property. Consider following example : a hill where one class is at its peak and second - on lower height levels around. In three dimensions these classes are easily separated by a plane but one can easily find a two dimensional projection which doesn't have that property (e.g. projection in which you are loosing the height dimension).

Of course T-SNE is not such linear projections but it's main purpose is to preserve a local structure of data, so that general property like linear separation property might be easly losed when using this approach.

Upvotes: 0

Related Questions