user17978261

Reputation:

LSTM prediction probability with sigmoid is low and so is AUC

I have time-series data for 3228 patients and I am forecasting a disease (Sepsis) using an LSTM built with the Keras Functional API. My data looks like this (access the whole data here and the whole code here):

 HR     O2Sat   Temp    SBP      DBP    Resp    Age  Gender  ICULOS  SepsisLabel    P_ID
80.2    97.4    36.40   127.5   70.4    17.9    46     0        1         0        15888
95.0    97.0    36.85   102.5   63.4    30.0    46     0        2         0        15888
97.0    96.0    36.85   108.0   67.6    30.0    46     0        3         0        15888
97.0    95.0    37.04   102.0   63.1    32.0    46     0        4         0        15888
98.0    96.0    36.99   108.0   66.2    30.0    46     0        5         0        15888

My goal is to predict the probability of a patient being Septic (SepsisLabel = 1) or Non-Septic (SepsisLabel = 0). I split the data 80:20 by Patient ID, with 2583 patients for training and 645 patients for testing. I then converted them to x_train, x_test, y_train, y_test, and turned those into multi-index dataframes so I could convert each patient's records into a fixed-length tensor of 72 time steps with post padding. Each patient now has its own tensor, and these tensors are fed to the LSTM model for making predictions. After training the model for 100 epochs, the training accuracy does not increase beyond 0.86. When I predict on the test data, the prediction probabilities are very low, and so is the AUC. Following is the code (a sketch of the padding step first, then the model):
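A minimal sketch of that padding step, with column names taken from the table above (the actual notebook is in the linked code, so the helper below is only a reconstruction):

import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

FEATURES = ['HR', 'O2Sat', 'Temp', 'SBP', 'DBP', 'Resp', 'Age', 'Gender', 'ICULOS']
MAX_LEN = 72  # fixed sequence length per patient

def to_padded_tensors(df):
    # group rows by patient and post-pad each sequence with zeros to MAX_LEN
    x_seqs, y_seqs = [], []
    for _, patient in df.groupby('P_ID'):
        x_seqs.append(patient[FEATURES].values)
        y_seqs.append(patient['SepsisLabel'].values)
    x = pad_sequences(x_seqs, maxlen=MAX_LEN, dtype='float32', padding='post', value=0.0)
    y = pad_sequences(y_seqs, maxlen=MAX_LEN, dtype='float32', padding='post', value=0.0)
    return x, y[..., np.newaxis]  # y shape: (patients, 72, 1) to match the model output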

import tensorflow as tf

# stop early when validation loss stalls, and checkpoint the best weights
my_callbacks = [
    tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=15),
    tf.keras.callbacks.ModelCheckpoint(filepath='model.{epoch:02d}-{val_loss:.2f}.h5',
                                       save_best_only=True, save_weights_only=True,
                                       monitor='val_accuracy'),
]

from tensorflow.keras.layers import (Input, Masking, LSTM, BatchNormalization,
                                     TimeDistributed, Dense)
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import RMSprop

# construct inputs (variable-length sequences, zero-padded)
x = Input((None, x_train.shape[-1]), name='input')
mask = Masking(0, name='input_masked')(x)

# stack LSTMs
lstm_kwargs = {'dropout': 0.20, 'recurrent_dropout': 0.1, 'return_sequences': True, 'implementation': 2}
lstm1 = LSTM(200, name='lstm1', **lstm_kwargs)(mask)
lstm2 = LSTM(200, name='lstm2', **lstm_kwargs)(lstm1)

btch = BatchNormalization()(lstm2)

# output: sigmoid layer
output = TimeDistributed(Dense(1, activation='sigmoid'), name='output')(btch)
model = Model(inputs=x, outputs=output)

# compile model
optimizer = RMSprop(learning_rate=0.001)
model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics = ['accuracy'])

# print layer shapes and model parameters
model.summary()
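The fit call is not in the snippet above; it is roughly the following (100 epochs with the callbacks defined earlier; the batch size and validation split are assumptions):

# assumed training call, not shown in the original snippet
history = model.fit(x_train, y_train,
                    validation_split=0.2,   # assumption: val_loss/val_accuracy need validation data
                    epochs=100,
                    batch_size=32,          # assumption
                    callbacks=my_callbacks)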

preds = model.predict(x_test)
# Output
array([[[0.17022534],
        [0.18894851],
        [0.20475313],
        ...,
        [0.2432414 ],
        [0.2432414 ],
        [0.2432414 ]],
        ...,
        [0.08733492],
        [0.08733492],
        [0.08733492]]], dtype=float32)
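For reference, one way to compute the AUC on these per-timestep probabilities while excluding the zero-padded steps (a sketch; assuming zero post padding, as in the Masking layer, and y_test of shape (patients, 72, 1)):

import numpy as np
from sklearn.metrics import roc_auc_score

# keep only real (non-padded) time steps: padded rows are all-zero feature vectors
valid = ~np.all(x_test == 0, axis=-1)                       # shape: (patients, 72)
print('test AUC:', roc_auc_score(y_test[..., 0][valid], preds[..., 0][valid]))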

The AUC:

AUC graph

How can I improve the model to get higher probabilities and a higher AUC? I tried removing BatchNormalization(), but the probabilities did not increase. I also tried removing the Masking layer, adding more LSTM layers, switching the optimizer to Adam, and changing the learning rate, but with no better results.

Upvotes: 0

Views: 479

Answers (2)

ags29

Reputation: 2696

Had a look at your data, and the reason you keep hitting 0.86 accuracy is that the incidence of positives in your data is roughly 14%. Accuracy is probably not the best metric to track here given the imbalance (area under the precision-recall curve would perhaps be better). You might also try the sample_weight argument of the fit function to weight your samples and counteract the imbalance, as sketched below.
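A minimal sketch of both suggestions against the question's model, assuming y_train has shape (patients, 72, 1) and roughly 14% positives (the weight value is illustrative; older Keras versions may also need sample_weight_mode='temporal' in compile for per-timestep weights):

import numpy as np
import tensorflow as tf

# track PR-AUC instead of accuracy on the imbalanced labels
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=[tf.keras.metrics.AUC(curve='PR', name='pr_auc')])

# up-weight the rare positive time steps (~14% prevalence -> weight of roughly 0.86/0.14)
sample_weight = np.where(y_train[..., 0] == 1, 6.0, 1.0)    # shape: (patients, 72)
model.fit(x_train, y_train,
          sample_weight=sample_weight,
          validation_split=0.2,
          epochs=100,
          callbacks=my_callbacks)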

Upvotes: 1

Kilian

Reputation: 501

What I would try, but no promises :) Feed the output of the LSTM layers into a Dense layer and then into the binary output, for example as sketched below.
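A sketch of that change against the question's model (the size and activation of the extra Dense layer are illustrative):

from tensorflow.keras.layers import Input, Masking, LSTM, BatchNormalization, TimeDistributed, Dense
from tensorflow.keras.models import Model

x = Input((None, x_train.shape[-1]), name='input')
mask = Masking(0, name='input_masked')(x)
lstm1 = LSTM(200, return_sequences=True, dropout=0.2, name='lstm1')(mask)
lstm2 = LSTM(200, return_sequences=True, dropout=0.2, name='lstm2')(lstm1)
btch = BatchNormalization()(lstm2)

# extra per-timestep Dense layer between the LSTM stack and the sigmoid output
hidden = TimeDistributed(Dense(64, activation='relu'), name='hidden')(btch)
output = TimeDistributed(Dense(1, activation='sigmoid'), name='output')(hidden)

model = Model(inputs=x, outputs=output)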

Upvotes: 0
