Reputation: 2069
Context: I have a set of documents, each with two associated probability values: the probability of belonging to class A and the probability of belonging to class B. The classes are mutually exclusive and the probabilities add up to one. So, for instance, document D has the ground-truth probabilities (0.6, 0.4).
Each document is represented by the tf-idf of the terms it contains, normalized from 0 to 1. I also tried doc2vec (normalized from -1 to 1) and a couple of other methods.
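For concreteness, here is a hypothetical toy example of what the inputs and targets look like (the values below are made up just for illustration; the real matrices are much larger):

import numpy as np

# Hypothetical toy data: 3 documents, 5 tf-idf features each,
# values already normalized to [0, 1].
X_train = np.array([[0.10, 0.00, 0.75, 0.30, 0.05],
                    [0.00, 0.60, 0.10, 0.00, 0.90],
                    [0.40, 0.20, 0.00, 0.85, 0.15]], dtype=np.float32)

# Ground-truth probability distributions over (class A, class B);
# each row sums to 1, e.g. document D -> (0.6, 0.4).
y_train = np.array([[0.6, 0.4],
                    [0.1, 0.9],
                    [0.5, 0.5]], dtype=np.float32)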
I built a very simple Neural Network to predict this probability distribution.
This is the code I wrote using nolearn:
import lasagne
import nolearn.lasagne
from lasagne import layers
# `es` is my own helper module that provides the EarlyStopping callback

net = nolearn.lasagne.NeuralNet(
    layers=[('input', layers.InputLayer),
            ('hidden1', layers.DenseLayer),
            ('output', layers.DenseLayer)],
    input_shape=(None, X_train.shape[1]),
    hidden1_num_units=1,
    output_num_units=2,
    output_nonlinearity=lasagne.nonlinearities.softmax,
    objective_loss_function=lasagne.objectives.binary_crossentropy,
    max_epochs=50,
    on_epoch_finished=[es.EarlyStopping(patience=5, gamma=0.0001)],
    regression=True,
    update=lasagne.updates.adam,
    update_learning_rate=0.001,
    verbose=2)
net.fit(X_train, y_train)
y_true, y_pred = y_test, net.predict(X_test)
My problem is that my predictions have a cutoff point and no prediction falls below it (see the plot below). The plot compares the true probability with my predictions: the closer a point is to the red line, the better the prediction. Ideally all the points would lie on the line. Why is this happening, and how can I solve it?
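For reference, the plot is just a scatter of true versus predicted class-A probability; a minimal sketch of how it can be reproduced (assuming matplotlib and that y_true and y_pred are NumPy arrays of shape (n, 2)):

import matplotlib.pyplot as plt

# Scatter of true vs. predicted probability for class A on the test set.
plt.scatter(y_true[:, 0], y_pred[:, 0], s=10, alpha=0.5)

# Red identity line: perfect predictions would fall exactly on it.
plt.plot([0, 1], [0, 1], color='red')

plt.xlabel('true probability (class A)')
plt.ylabel('predicted probability (class A)')
plt.show()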
Edit: actually I solved the problem by simply removing the hidden layer:
net = nolearn.lasagne.NeuralNet(
    layers=[('input', layers.InputLayer),
            ('output', layers.DenseLayer)],
    input_shape=(None, X_train.shape[1]),
    output_num_units=2,
    output_nonlinearity=lasagne.nonlinearities.softmax,
    objective_loss_function=lasagne.objectives.binary_crossentropy,
    max_epochs=50,
    on_epoch_finished=[es.EarlyStopping(patience=5, gamma=0.0001)],
    regression=True,
    update=lasagne.updates.adam,
    update_learning_rate=0.001,
    verbose=2)
net.fit(X_train, y_train)
y_true, y_pred = y_test, net.predict(X_test)
But I still fail to understand why I had this problem and why removing the hidden layer solved it. Any ideas?
Here is the new plot:
Upvotes: 3
Views: 565
Reputation: 83
I think your training set output values should be [0, 1] or [1, 0]; a soft target like [0.6, 0.4] is not well suited for softmax with cross-entropy.
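For example, a minimal sketch of what I mean, converting the soft targets into hard one-hot labels before training (assuming NumPy and that y_train holds the (A, B) probabilities):

import numpy as np

# Turn soft targets such as [0.6, 0.4] into hard one-hot labels [1, 0]
# by taking the most probable class for each document.
hard_classes = np.argmax(y_train, axis=1)                  # e.g. [0, 1, ...]
y_train_hard = np.eye(2, dtype=np.float32)[hard_classes]   # one-hot rows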
Upvotes: 0