Kshitiz

Reputation: 3863

Trouble understanding LSTM output.

I have two data sets that contain sequences of numbers. One of them is my X and the other one is my Y.

For example,

X:
1 0 7 2
4 8 2 0
5 9 2 1
.
.
.

Shape of X is: (10000, 4)

Y:
10 24 5 15
7  6  10 4
13 22 6  2
.
.
.
Shape of Y is: (10000, 4)

The values in X lie in the range 0-10, while the values in Y lie in the range 0-24.

I'm using the LSTM implementation in Keras to train on X and Y. Since an LSTM requires 3-dimensional input, I preprocessed the data and changed X to (10000, 4, 10) and Y to (10000, 4, 24); a sketch of this encoding step follows.
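For reference, this kind of one-hot encoding can be done with Keras's to_categorical. Below is a minimal sketch using random placeholder data (not the real data sets), assuming the values fit the class counts implied by the target shapes, i.e. 0-9 for X and 0-23 for Y:

    import numpy as np
    from keras.utils import to_categorical

    # Placeholder integer data with the same shapes as the question's X and Y.
    X_int = np.random.randint(0, 10, size=(10000, 4))   # 10 classes -> depth 10
    Y_int = np.random.randint(0, 24, size=(10000, 4))   # 24 classes -> depth 24

    # to_categorical one-hot encodes each integer, appending a class axis.
    X = to_categorical(X_int, num_classes=10)           # shape (10000, 4, 10)
    Y = to_categorical(Y_int, num_classes=24)           # shape (10000, 4, 24)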

After preprocessing into one-hot encoding (this is the actual data, not the example values shown above):

X:
[[[1 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 1 0 ... 0 0 0]
  [1 0 0 ... 0 0 0]]

 [[0 0 1 ... 0 0 0]
  [0 0 0 ... 0 1 0]
  [1 0 0 ... 0 0 0]
  [0 0 0 ... 1 0 0]]

 [[0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 1 0 0]
  [0 0 0 ... 0 0 0]]

 ...

 [[0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]]

 [[0 0 0 ... 1 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]]

 [[0 0 0 ... 0 0 0]
  [1 0 0 ... 0 0 0]
  [1 0 0 ... 0 0 0]
  [0 1 0 ... 0 0 0]]]



Y:
[[[0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 1 0 0]]

 [[0 0 0 ... 0 0 0]
  [0 0 0 ... 1 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]]

 [[0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 1 0 0]
  [0 0 0 ... 0 0 0]]

 ...

 [[0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]]

 [[0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 1 0 0]]

 [[0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]
  [0 0 0 ... 0 0 0]]]

This is the code for my LSTM model:

    import numpy as np
    from sklearn.model_selection import train_test_split
    from keras.models import Sequential
    from keras.layers import LSTM

    X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=1)

    model = Sequential()
    # Four stacked LSTM layers; return_sequences=True keeps the per-timestep
    # outputs, so the final output has shape (batch, 4, 24). Only the first
    # layer needs input_shape.
    model.add(LSTM(units=24, input_shape=X_train.shape[1:], return_sequences=True, kernel_initializer='glorot_normal', recurrent_initializer='glorot_normal', activation='sigmoid'))
    model.add(LSTM(units=24, return_sequences=True, kernel_initializer='glorot_normal', recurrent_initializer='glorot_normal', activation='sigmoid'))
    model.add(LSTM(units=24, return_sequences=True, kernel_initializer='glorot_normal', recurrent_initializer='glorot_normal', activation='sigmoid'))
    model.add(LSTM(units=24, return_sequences=True, kernel_initializer='glorot_normal', recurrent_initializer='glorot_normal', activation='sigmoid'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    model.fit(X_train, Y_train, epochs=500, validation_data=(X_test, Y_test))
    model.save('LSTM500.h5')

    predictions = model.predict(X_test)
    predictions = np.array(predictions, dtype=np.float64)
    predictions = predictions.reshape(2000, 4, 24)

Output:

[[[[0.1552688  0.15805855 0.2013046  ... 0.16005482 0.19403476
    0.        ]
   [0.0458279  0.09995601 0.06456595 ... 0.09573169 0.07952237
    0.        ]
   [0.19871283 0.19968285 0.06270849 ... 0.14653654 0.18313469
    0.        ]
   [0.08407309 0.091876   0.09707277 ... 0.12661831 0.12858406
    0.        ]]

  [[0.15482235 0.14433247 0.18260191 ... 0.15641384 0.20746264
    0.        ]
   [0.03096719 0.05375536 0.05373315 ... 0.05018555 0.07592873
    0.        ]
   [0.20420487 0.17884348 0.13145864 ... 0.17901334 0.19768076
    0.        ]
   [0.03465272 0.06732351 0.02182322 ... 0.06144218 0.07827628
    0.        ]]

  [[0.15116604 0.15068266 0.18474537 ... 0.17088319 0.15841168
    0.        ]
   [0.09633015 0.11277901 0.10069521 ... 0.09309217 0.11326427
    0.        ]
   [0.17512578 0.13187788 0.10418645 ... 0.10735759 0.10635827
    0.        ]
   [0.13673681 0.12714103 0.06212005 ... 0.03213149 0.14153068
    0.        ]]

  ...

Shape of the prediction: (1, 2000, 4, 24)

I reshaped the prediction array to (2000, 4, 24).

[[[0.1552688  0.15805855 0.2013046  ... 0.16005482 0.19403476 0.        ]
  [0.0458279  0.09995601 0.06456595 ... 0.09573169 0.07952237 0.        ]
  [0.19871283 0.19968285 0.06270849 ... 0.14653654 0.18313469 0.        ]
  [0.08407309 0.091876   0.09707277 ... 0.12661831 0.12858406 0.        ]]

 [[0.15482235 0.14433247 0.18260191 ... 0.15641384 0.20746264 0.        ]
  [0.03096719 0.05375536 0.05373315 ... 0.05018555 0.07592873 0.        ]
  [0.20420487 0.17884348 0.13145864 ... 0.17901334 0.19768076 0.        ]
  [0.03465272 0.06732351 0.02182322 ... 0.06144218 0.07827628 0.        ]]

 [[0.15116604 0.15068266 0.18474537 ... 0.17088319 0.15841168 0.        ]
  [0.09633015 0.11277901 0.10069521 ... 0.09309217 0.11326427 0.        ]
  [0.17512578 0.13187788 0.10418645 ... 0.10735759 0.10635827 0.        ]
  [0.13673681 0.12714103 0.06212005 ... 0.03213149 0.14153068 0.        ]]

 ...

I do not understand the output I got. What are these numbers? Shouldn't the prediction array contain only 0s and 1s (so that I can retrieve the actual values), just like Y_test? Thanks.

Upvotes: 0

Views: 110

Answers (1)

Mete Han Kahraman

Reputation: 760

Those numbers come from your last layer, which uses a sigmoid activation. Sigmoid returns values between 0 and 1, which is exactly what we see in your output.

How to interpret these values?

Since you are feeding in one-hot inputs and looking for one-hot outputs, you can select the maximum number along the last axis and take its index on that axis with np.argmax(predictions, axis=-1). This gives you a NumPy array of shape (2000, 4), where each element is a number in [0, 24), the same format as your original data. These values are what your LSTM model predicted as the most likely outcome.

The second-biggest number would be the second most likely outcome, as sketched below.
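A minimal sketch of both decoding steps, assuming predictions has shape (2000, 4, 24) as in the question:

    import numpy as np

    # Most likely class per timestep: index of the max along the class axis.
    decoded = np.argmax(predictions, axis=-1)            # shape (2000, 4), values in [0, 24)

    # Full ranking from most to least likely; column 1 is the runner-up class.
    ranked = np.argsort(predictions, axis=-1)[..., ::-1]
    second_best = ranked[..., 1]                         # shape (2000, 4)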

Upvotes: 1
