Reputation:
I have time-series data for 3228 patients and I am forecasting a disease (Sepsis) using an LSTM built with the Keras Functional API. My data looks like this. Access the whole data here and the whole code here.
HR O2Sat Temp SBP DBP Resp Age Gender ICULOS SepsisLabel P_ID
80.2 97.4 36.40 127.5 70.4 17.9 46 0 1 0 15888
95.0 97.0 36.85 102.5 63.4 30.0 46 0 2 0 15888
97.0 96.0 36.85 108.0 67.6 30.0 46 0 3 0 15888
97.0 95.0 37.04 102.0 63.1 32.0 46 0 4 0 15888
98.0 96.0 36.99 108.0 66.2 30.0 46 0 5 0 15888
My goal is to predict the probability that a patient is Septic (SepsisLabel = 1) or Non-Septic (SepsisLabel = 0). I split the data 80:20 by Patient ID, with 2583 patients for training and 645 patients for testing. I then converted the splits to x_train, x_test, y_train, y_test and to a MultiIndex dataframe, so that each patient could be turned into a fixed-length tensor of 72 time steps with post-padding (see the padding sketch after the output below). Each patient now has its own tensor, and these tensors are fed to the LSTM model to make predictions. After training the model for 100 epochs, the training accuracy does not increase past 0.86. I then ran predictions on the test data, and the predicted probabilities are very low, as is the AUC. Here is the code:
import tensorflow as tf
from tensorflow.keras.layers import Input, Masking, LSTM, BatchNormalization, TimeDistributed, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import RMSprop

my_callbacks = [
    tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=15),
    tf.keras.callbacks.ModelCheckpoint(filepath='model.{epoch:02d}-{val_loss:.2f}.h5',
                                       save_best_only=True, save_weights_only=True,
                                       monitor='val_accuracy'),
]

# construct inputs: variable-length sequences of per-timestep features
x = Input((None, x_train.shape[-1]), name='input')
mask = Masking(0, name='input_masked')(x)

# stack LSTMs
lstm_kwargs = {'dropout': 0.20, 'recurrent_dropout': 0.1, 'return_sequences': True, 'implementation': 2}
lstm1 = LSTM(200, name='lstm1', **lstm_kwargs)(mask)
lstm2 = LSTM(200, name='lstm2', **lstm_kwargs)(lstm1)
btch = BatchNormalization()(lstm2)

# output: per-timestep sigmoid probability of sepsis
output = TimeDistributed(Dense(1, activation='sigmoid'), name='output')(btch)
model = Model(inputs=x, outputs=output)

# compile model
optimizer = RMSprop(learning_rate=0.001)
model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])

# print layer shapes and model parameters
model.summary()

# predict on the test set (the model is trained for 100 epochs with my_callbacks before this)
preds = model.predict(x_test)
# Output
array([[[0.17022534],
[0.18894851],
[0.20475313],
...,
[0.2432414 ],
[0.2432414 ],
[0.2432414 ]],
...,
[0.08733492],
[0.08733492],
[0.08733492]]], dtype=float32)
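For reference, building the fixed-length, post-padded per-patient tensors can be sketched roughly like this (a minimal sketch with pad_sequences; the train_df variable and the exact column handling are illustrative, not the code from the linked notebook):

import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

FEATURES = ['HR', 'O2Sat', 'Temp', 'SBP', 'DBP', 'Resp', 'Age', 'Gender', 'ICULOS']

# collect one (time_steps, n_features) array and one label sequence per patient
x_seqs = [g[FEATURES].values for _, g in train_df.groupby('P_ID')]
y_seqs = [g['SepsisLabel'].values for _, g in train_df.groupby('P_ID')]

# post-pad (or truncate) every patient to exactly 72 time steps; pad value 0 matches Masking(0)
x_train = pad_sequences(x_seqs, maxlen=72, padding='post', truncating='post', dtype='float32')
y_train = pad_sequences(y_seqs, maxlen=72, padding='post', truncating='post', dtype='float32')[..., np.newaxis]

print(x_train.shape)  # (n_patients, 72, 9)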
How can I improve the model to get higher probabilities and a higher AUC? I tried removing BatchNormalization(), but the probabilities did not increase. I also tried removing the Masking layer, adding more LSTM layers, switching the optimizer to Adam, and changing the learning rate, but none of it gave better results.
Upvotes: 0
Views: 479
Reputation: 2696
Had a look at your data, and the reason you keep hitting 0.86 accuracy is that the incidence of positives in your data is roughly ~14%. Accuracy is probably not the best metric to track here given the imbalance (area under the precision-recall curve would likely be better). You might also try using the sample_weight argument of the fit function to weight your samples and counteract the imbalance.
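For example, something along these lines (a rough sketch; the weighting scheme and metric choice are just one option, and the padded timesteps are counted as negatives here):

import numpy as np
import tensorflow as tf

# up-weight the positive timesteps so the ~14% positives count more heavily
pos_frac = y_train.mean()  # fraction of positive labels
sample_weight = np.where(y_train[..., 0] == 1, (1 - pos_frac) / pos_frac, 1.0)  # shape (n_patients, 72)

model.compile(
    optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001),
    loss='binary_crossentropy',
    # track area under the precision-recall curve instead of plain accuracy
    metrics=[tf.keras.metrics.AUC(curve='PR', name='auc_pr')],
)

model.fit(
    x_train, y_train,
    sample_weight=sample_weight,  # 2D array applies a different weight to every timestep
    validation_split=0.2,
    epochs=100,
)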
Upvotes: 1
Reputation: 501
What I would try, but no promises :) Feed the hidden states of the LSTM into a dense layer and then into the binary output.
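Something like this, for instance (a sketch that reuses the layer sizes from the question; the dense layer width is arbitrary):

from tensorflow.keras.layers import Input, Masking, LSTM, TimeDistributed, Dense
from tensorflow.keras.models import Model

x = Input((None, x_train.shape[-1]), name='input')
mask = Masking(0, name='input_masked')(x)
lstm1 = LSTM(200, return_sequences=True, dropout=0.2, name='lstm1')(mask)
lstm2 = LSTM(200, return_sequences=True, dropout=0.2, name='lstm2')(lstm1)

# extra dense layer between the LSTM hidden states and the binary output
dense = TimeDistributed(Dense(64, activation='relu'), name='dense')(lstm2)
output = TimeDistributed(Dense(1, activation='sigmoid'), name='output')(dense)

model = Model(inputs=x, outputs=output)
model.compile(optimizer='adam', loss='binary_crossentropy')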
Upvotes: 0