Reputation: 11
I have a dataset of weather forecasts and am trying to build a model that predicts which forecast will be the most accurate the next day.
To do this, my target y has the form y=[1,0,1,0], because I have forecasts from 4 different organizations. A 1 means that forecast was the best one for the current record; more than one 1 means several forecasts tied for the best prediction.
My problem is that I want to train a model on this data that also learns that predicting just one of the 1 values correctly counts as a 100% correct answer, since I only need one of the tied best forecasts as the result. I believe the way I am doing this now 'shaves' accuracy off my evaluation. Is there a way to implement this in Keras? The architecture of the neural network is totally experimental, and there is no specific reason why I chose it. My train dataset consists of 6463 rows × 505 columns. This is the code I wrote:
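# Imports assumed here; the original post did not show them
from tensorflow.keras import regularizers
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.models import Sequential

model = Sequential()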
model.add(LSTM(150, activation='relu',activity_regularizer=regularizers.l2(l=0.0001)))
model.add(Dense(100, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(24, activation='relu'))
model.add(Dense(4, activation='softmax'))
# reshape input to be 3D [samples, timesteps, features]
X_train_sc =X_train_sc.reshape((X_train_sc.shape[0], 1, X_train_sc.shape[1]))
X_test_sc = X_test_sc.reshape((X_test_sc.shape[0], 1,X_test_sc.shape[1]))
#validation set
# reshape input to be 3D for LSTM[samples, timesteps, features]
x_val_sc =x_val_sc.reshape((x_val_sc.shape[0], 1, x_val_sc.shape[1]))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['categorical_accuracy'])
history = model.fit(x=X_train_sc, y=y_train, validation_data=(x_val_sc, y_val), epochs=300, batch_size=24)
yhat= model.predict(X_test_sc)
My accuracy is ~44%.
Upvotes: 1
Views: 2000
Reputation: 16916
If you want predictions of the form [1,0,1,0], i.e. the model should predict the probability of belonging to each of the 4 classes independently, that is called multi-label classification. What you have coded is multi-class classification.
For multi-label classification, the last layer is a Dense layer of size 4 (one unit per class) with sigmoid activation, and the loss is binary_crossentropy:
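# Imports assumed here; the original answer did not show them
import numpy as np
from tensorflow import keras

x = np.random.randn(100,10,1)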
y = np.random.randint(0,2,(100,4))
model = keras.models.Sequential()
model.add(keras.layers.LSTM(16, activation='relu', input_shape=(10,1), return_sequences=False))
model.add(keras.layers.Dense(8, activation='relu'))
model.add(keras.layers.Dense(4, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy')
model.fit(x, y)
print (model.predict(x))
array([[0.5196002 , 0.52978194, 0.5009601 , 0.5036485 ],
[0.508756 , 0.5189857 , 0.5022978 , 0.50169533],
[0.5213044 , 0.5254892 , 0.51159555, 0.49724004],
[0.5144601 , 0.5264933 , 0.505496 , 0.5008205 ],
[0.50524575, 0.5147699 , 0.50287664, 0.5021702 ],
[0.521035 , 0.53326863, 0.49642274, 0.50102305],
As you can see, the probabilities in each prediction do not sum to one; each value is the probability of the sample belonging to the corresponding class. So if a probability is > 0.5, you can say the sample belongs to that class.
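A minimal sketch of that thresholding step, using the first three rows of the output printed above. The 0.5 cutoff is the one mentioned in this answer; the variable names are just for illustration.

```python
import numpy as np

# First three rows of the sigmoid predictions printed above
probs = np.array([
    [0.5196002, 0.52978194, 0.5009601, 0.5036485],
    [0.508756, 0.5189857, 0.5022978, 0.50169533],
    [0.5213044, 0.5254892, 0.51159555, 0.49724004],
])

# Each column is thresholded independently, so a row may contain several 1s
labels = (probs > 0.5).astype(int)

# With sigmoid outputs the rows are not constrained to sum to 1
row_sums = probs.sum(axis=1)

print(labels)
print(row_sums)
```

Because every column is treated on its own, nothing stops the model from marking all four classes, or none of them, for a given sample.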
On the other hand, if you use softmax, the probabilities sum to 1, i.e. the sample is assigned to a single class: the one with the highest probability (which need not exceed 0.5 when there are 4 classes).
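To tie this back to the question, here is a rough NumPy sketch of softmax outputs and of an evaluation that counts a prediction as correct when the top-scoring class is any of the classes marked 1 in y. The helper names `softmax` and `any_best_accuracy` and the toy numbers are made up for illustration; this is not a built-in Keras metric, just one possible way to score the predictions after `model.predict`.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability before exponentiating
    shifted = logits - logits.max(axis=1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=1, keepdims=True)

def any_best_accuracy(y_true, y_prob):
    # Hypothetical helper: a row is a hit if the argmax class has a 1 in y_true
    picked = np.argmax(y_prob, axis=1)
    hits = y_true[np.arange(len(y_true)), picked]
    return float(hits.mean())

# Toy raw scores for 3 samples and 4 forecasts (made-up numbers)
logits = np.array([
    [2.0, 0.1, 1.5, 0.0],
    [0.2, 0.1, 0.3, 3.0],
    [1.0, 1.2, 0.9, 1.1],
])
y_prob = softmax(logits)

# Targets in the question's format; the first row has two tied best forecasts
y_true = np.array([
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
])

print(y_prob.sum(axis=1))                 # each row sums to 1
print(any_best_accuracy(y_true, y_prob))  # fraction of rows where argmax is a best forecast
```

In the third toy row the largest softmax probability is below 0.5, yet the argmax still picks a single class, which is why the highest value rather than a fixed cutoff decides the class.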
Upvotes: 1