Nayan

Reputation: 9

How to set a separate threshold for each class (multiclass) in the model.predict array in Keras... any help will be appreciated

Hello all, I am building a model in Keras using ConvLSTM (sequential model). I used softmax at the last layer to classify 9 labels, but my model predicts only 3 labels correctly despite showing good precision & recall, so I want to set the threshold manually for each class label. How can I do that? I have been stuck for more than a day; how can I solve this? Thank you.

I have researched this extensively but I can't solve it.

Upvotes: 0

Views: 3354

Answers (1)

Rodrigo Laguna

Reputation: 1850

When you run model.predict(X), you get an array of shape (n_samples, n_classes).

You can interpret each of those n_classes columns as the output of a binary classifier that answers the question "does this example belong to class i?". So you can set a different threshold for each class instead of using the regular argmax.

So, assuming class i is encoded as the i-th column of outputs, you can do this:

import numpy as np
from sklearn.metrics import f1_score

i = 3  # solve for i=3 first, later you can generalize
y_pred = model.predict(X_val)

th = .5
y_pred_i = np.zeros_like(y_pred[:, i])
y_pred_i[y_pred[:, i] > th] = 1  # set to 1 those above the threshold

print(f1_score(y_val == i, y_pred_i))

Now all you need to do is try different values for the threshold th in a for loop and pick the best one according to your metric (here I used F1, but you can choose one more suitable for your problem).
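That search loop can be sketched like this (with made-up y_val/y_pred arrays standing in for your validation labels and model.predict output, and an assumed candidate grid of 19 evenly spaced thresholds):

```python
import numpy as np
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
# toy stand-ins for y_val and model.predict(X_val)
y_val = rng.integers(0, 9, size=200)
y_pred = rng.uniform(size=(200, 9))

i = 3  # class under consideration
best_th, best_f1 = 0.0, -1.0
for th in np.linspace(0.05, 0.95, 19):
    # binarize the i-th column at this candidate threshold
    y_pred_i = (y_pred[:, i] > th).astype(int)
    score = f1_score(y_val == i, y_pred_i, zero_division=0)
    if score > best_f1:
        best_th, best_f1 = th, score

print(f'class {i}: best threshold {best_th:.2f} (f1={best_f1:.3f})')
```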

You also need to repeat this process for all of your n_classes, and that's it: you end up with a different threshold for each class. This article goes deeper into the binary case, which here you need to replicate once per class.

Some final notes:

  • When converting a single multiclass problem into multiple binary problems, each of those binary problems will be imbalanced, so be careful with the metric you choose to maximize.

  • Never, ever choose the best threshold based on the score you get on the test set: use a separate validation set to perform the threshold selection, or cross-validate, but don't do this with your test set, otherwise you're overfitting to the test set. On the other hand, if you choose the thresholds by validating on your training set, you will probably overestimate them (training scores tend to be extreme values, near 0 or near 1, while non-training scores are usually more spread out).

  • New problems arise when doing this re-framing:

    • What if none of the classes reaches its threshold? You must make a decision: predict no answer, since none of the predictions is good enough according to your thresholds, or return the one that maximizes your score, because it is the most trustworthy of your options.
    • What if more than one class is above its threshold? Maybe you can predict more than one class, if that is okay in your application, or maybe consider the one with the higher score, or the higher score over its threshold.
    • Also consider the possibility of calibrating each prediction before choosing thresholds.
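On the validation point above, a minimal sketch is to carve a threshold-selection split out of the training data, leaving the test set untouched (the feature and label arrays here are made up, and the 25% split size is an assumption):

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(300, 5))      # stand-in features
y = rng.integers(0, 9, size=300)   # stand-in labels for 9 classes

# hold out a split used only for picking thresholds,
# stratified so all 9 classes appear in both parts
X_fit, X_thresh, y_fit, y_thresh = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

print(X_fit.shape, X_thresh.shape)  # (225, 5) (75, 5)
```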

Edit: Let me share a working toy example.

Assuming you have only 3 classes and want to pick thresholds that maximize your F1 score, the following implementation is based on precision_recall_curve.

I'll use fake data for y_val, y_pred:

import numpy as np
y_val = np.random.randint(0,3, size=50)
y_pred = np.random.uniform(size=(50,3))

# force some correlation between predictions and target
for i in range(50):
    y_pred[i, y_val[i]] += np.random.uniform(.1,.2)

Now that we have invented some data, you can choose best thresholds as follows:

from sklearn.metrics import precision_recall_curve

_, n_classes = y_pred.shape

for i in range(n_classes):

    # Computing best threshold for i-th class
    precision, recall, thresholds = precision_recall_curve(y_val, y_pred[:, i], pos_label=i)

    # compute f-1 (guarding against 0/0 where precision = recall = 0)
    f1 = np.nan_to_num(2 * precision * recall / (precision + recall))

    # pick up the best threshold's index
    best_idx = np.argmax(f1)
    print(f'For class {i} the best possible threshold is {thresholds[best_idx]:.3f} which leads to f1={f1[best_idx]:.3f}')

Which should output something like this:

For class 0 the best possible threshold is 0.185 which leads to f1=0.585
For class 1 the best possible threshold is 0.831 which leads to f1=0.571
For class 2 the best possible threshold is 0.259 which leads to f1=0.590

Then, to make a prediction, you need to resolve the issues I mentioned before.

Here goes a simple example:

# I took those thresholds from the previous run
th0, th1, th2 = 0.185, 0.831, 0.259

y_new_pred = np.random.uniform(size=(1,3))

if y_new_pred[:, 0] > th0:
    print('this belongs to class 0')

if y_new_pred[:, 1] > th1:
    print('this belongs to class 1')
    
if y_new_pred[:, 2] > th2:
    print('this belongs to class 2')

Note that if you play with this a little, you will find cases where nothing is printed (i.e. all predictions are below your thresholds) and other cases where more than one prediction is printed (i.e. your example could belong to more than one class).

How to fix those cases depends on your use case.
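For instance, one possible (hypothetical) rule is to pick the class whose score exceeds its threshold by the largest margin, and abstain when none clears its threshold; the helper name and the reused thresholds are just for illustration:

```python
import numpy as np

# thresholds as found in the run above (assumed values for illustration)
ths = np.array([0.185, 0.831, 0.259])

def predict_with_thresholds(scores, ths):
    """Return the class whose score exceeds its threshold by the
    largest margin, or None if no class clears its threshold."""
    margins = scores - ths
    if np.all(margins <= 0):
        return None  # abstain: no prediction is trustworthy enough
    return int(np.argmax(margins))

print(predict_with_thresholds(np.array([0.1, 0.2, 0.3]), ths))  # 2
print(predict_with_thresholds(np.array([0.1, 0.1, 0.1]), ths))  # None
```

Whether abstaining is acceptable (versus falling back to a plain argmax) depends on your application.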

Upvotes: 3
