lithuak

Reputation: 6328

How to get predictions vector from Theano stacked autoencoder

I'm trying to modify the Stacked Autoencoder for classification from chapter 8 of the Theano deep learning tutorial. The autoencoder code I'm working with is available here.

My dataset consists of 4 arrays: test_set_x, test_set_y, valid_set_x, valid_set_y. The names are self-explanatory.

This is how the trained autoencoder is checked on the validation set:

valid_score = theano.function([], sda.errors,
                 givens={
                    sda.x: valid_set_x,
                    sda.y: valid_set_y},
                 name='valid_test')

print valid_score()

This code prints out "0.87" on my dataset, so it does work.

Writing the same check out more verbosely,

valid_score = theano.function([], T.mean(T.neq(sda.logLayer.y_pred, sda.y)),
                 givens={
                    sda.x: valid_set_x,
                    sda.y: valid_set_y},
                 name='valid_test')

still gives the same result, 87%.

But whenever I try to get the actual class prediction vector directly, I get a very wrong result: all elements of the result vector are equal to 4 (one of my classes).

My attempt looks like this:

predict = theano.function([], sda.logLayer.y_pred,
                   givens={sda.x: valid_set_x})
print predict()

This prints out "[4, 4, 4, ....., 4, 4]". Comparing this result with the valid_set_y vector gives about 12% accuracy, nowhere near 87%.

I don't understand what I'm doing wrong.

Please help me if you've ever dealt with Theano autoencoders and/or the mentioned tutorial.

Thank you.

Upvotes: 3

Views: 981

Answers (1)

Shai

Reputation: 114786

The valid_score output is the error rate on the validation set, not the accuracy: T.mean(T.neq(y_pred, y)) is the fraction of examples whose prediction differs from the label. A validation score of 0.87 therefore means that only about 12-13% of your validation examples were classified correctly. This result is consistent with an "all 4" prediction rule.
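
To make the relationship concrete, here is a minimal NumPy sketch of the same computation. The label array is made up for illustration and is not your data; the point is only that the mean of "prediction != label" is an error rate, and accuracy is one minus that value.

```python
import numpy as np

# Hypothetical labels, for illustration only (not the asker's data).
y_true = np.array([4, 1, 0, 4, 2, 3, 1, 0])

# A degenerate classifier that always predicts class 4.
y_pred = np.full_like(y_true, 4)

# NumPy counterpart of T.mean(T.neq(y_pred, y)): the fraction of mismatches.
error_rate = np.mean(y_pred != y_true)

# Accuracy is the complement of the error rate.
accuracy = 1.0 - error_rate

print("error rate:", error_rate)
print("accuracy:", accuracy)
```

With these made-up labels, two of the eight examples belong to class 4, so the constant predictor is wrong on the other six. On your data the same arithmetic gives 1 - 0.87 = 0.13, which matches the roughly 12% agreement you measured by hand.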

Upvotes: 2
