orestis giokas

Reputation: 11

Is there a way to use multilabel classification in Keras but count a prediction as correct when the model predicts only one of the correct labels?

I have a dataset of weather forecasts and am trying to make a model that predicts which forecast will be more accurate the next day.

In order to do so, my y output is of the form y=[1,0,1,0], because I have forecasts from 4 different organizations. A 1 means that forecast was the best for the current record, and multiple 'ones' mean several forecasts tied for best.

My problem is that I want a model that trains on these data but treats a prediction as 100% correct when it matches just one of the tied-best forecasts, since I only need one of the equally good forecasts as a result. I believe the way I am doing it now 'shaves' accuracy off my evaluation. Is there a way to implement this in Keras? The architecture of the neural network is purely experimental; there is no specific reason why I chose it. This is the code I wrote. My training dataset consists of 6463 rows × 505 columns.

from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras import regularizers

model = Sequential()

model.add(LSTM(150, activation='relu', activity_regularizer=regularizers.l2(l=0.0001)))
model.add(Dense(100, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(50, activation='relu'))

model.add(Dense(24, activation='relu'))
model.add(Dense(4, activation='softmax'))


# LSTM: reshape input to be 3D [samples, timesteps, features]
X_train_sc = X_train_sc.reshape((X_train_sc.shape[0], 1, X_train_sc.shape[1]))
X_test_sc = X_test_sc.reshape((X_test_sc.shape[0], 1, X_test_sc.shape[1]))

# validation set
x_val = X_train.iloc[-2000:-1300, 0:505]
y_val = y_train[-2000:-1300]

x_val_sc = scaler.transform(x_val)

# reshape input to be 3D for the LSTM [samples, timesteps, features]
x_val_sc = x_val_sc.reshape((x_val_sc.shape[0], 1, x_val_sc.shape[1]))

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['categorical_accuracy'])
history = model.fit(x=X_train_sc, y=y_train, validation_data=(x_val_sc, y_val), epochs=300, batch_size=24)
print(model.evaluate(X_test_sc, y_test))
yhat = model.predict(X_test_sc)

My accuracy is ~44%
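For illustration, here is a plain-NumPy sketch of the evaluation rule I have in mind (not the Keras metric itself, and the function name is just a placeholder): a prediction counts as correct if the single forecast the model picks is among the tied-best ones.

```python
import numpy as np

def any_match_accuracy(y_true, y_pred):
    """A prediction is correct if the arg-max of the predicted
    probabilities points at any of the '1' positions in y_true."""
    picked = np.argmax(y_pred, axis=1)             # index of the single predicted forecast
    hits = y_true[np.arange(len(y_true)), picked]  # 1 if that forecast was among the best
    return hits.mean()

# two samples: the first has two equally good forecasts (indices 0 and 2)
y_true = np.array([[1, 0, 1, 0],
                   [0, 1, 0, 0]])
y_pred = np.array([[0.1, 0.2, 0.6, 0.1],   # picks index 2 -> correct
                   [0.7, 0.1, 0.1, 0.1]])  # picks index 0 -> wrong
print(any_match_accuracy(y_true, y_pred))  # 0.5
```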

Upvotes: 1

Views: 2000

Answers (1)

mujjiga

Reputation: 16916

If you want to make predictions of the form [1,0,1,0], i.e. the model should predict the probability of belonging to each of the 4 classes independently, then it is called multi-label classification. What you have coded is multi-class classification.

Multi-label classification

Your last layer will be a dense layer of size 4, one unit per class, with sigmoid activation, and you will use a binary_crossentropy loss.

import numpy as np
import keras

x = np.random.randn(100, 10, 1)
y = np.random.randint(0, 2, (100, 4))

model = keras.models.Sequential()

model.add(keras.layers.LSTM(16, activation='relu', input_shape=(10,1), return_sequences=False))
model.add(keras.layers.Dense(8, activation='relu'))
model.add(keras.layers.Dense(4, activation='sigmoid'))

model.compile(optimizer='adam', loss='binary_crossentropy')
model.fit(x, y)

Check

print (model.predict(x))

Output

array([[0.5196002 , 0.52978194, 0.5009601 , 0.5036485 ],
       [0.508756  , 0.5189857 , 0.5022978 , 0.50169533],
       [0.5213044 , 0.5254892 , 0.51159555, 0.49724004],
       [0.5144601 , 0.5264933 , 0.505496  , 0.5008205 ],
       [0.50524575, 0.5147699 , 0.50287664, 0.5021702 ],
       [0.521035  , 0.53326863, 0.49642274, 0.50102305],
.........

As you can see, the probabilities in each prediction do not sum to one; rather, each value is the probability of the sample belonging to the corresponding class. So if a probability is > 0.5 you can say the sample belongs to that class.
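As a concrete sketch, the per-class 0.5 thresholding described above can be done in NumPy (the probabilities here are made-up values standing in for model output):

```python
import numpy as np

# stand-in for model.predict(x): one row of sigmoid outputs per sample
probs = np.array([[0.52, 0.53, 0.50, 0.50],
                  [0.51, 0.48, 0.62, 0.50]])

# independent per-class decision: each probability > 0.5 becomes a 1
labels = (probs > 0.5).astype(int)
print(labels)  # [[1 1 0 0]
               #  [1 0 1 0]]
```

Note that a row can end up with zero, one, or several 1s, which is exactly what multi-label targets like [1,0,1,0] need.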

On the other hand, if you use softmax the probabilities sum up to 1, and the predicted class is the single one with the highest probability.
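A quick sketch of that difference, using made-up logits: softmax couples the outputs so they form one distribution, and the decision is an arg-max rather than a per-class threshold.

```python
import numpy as np

logits = np.array([2.0, 1.0, 0.5, 0.1])
softmax = np.exp(logits) / np.exp(logits).sum()

print(round(softmax.sum(), 6))  # 1.0 -- the outputs are one coupled distribution
print(np.argmax(softmax))       # 0   -- the single predicted class
```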

Upvotes: 1
