Zeshan Akber

Reputation: 1

IndexError when adding an attention block to a deep neural network for a regression problem

I want to add a tf.keras.layers.MultiHeadAttention layer between two layers of a neural network. However, I am getting an IndexError.

The detailed code is as follows:

    x1 = Dense(58, activation='relu')(x1)
    x1 = Dropout(0.1)(x1)
    print(x1.shape)
    attention = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=58,
                dropout=0.1, output_shape=x1.shape)(x1, x1)

    x1 = Dropout(0.2)(attention)
    x1 = Dense(59, activation='relu')(x1)
    output = Dense(1, activation='linear')(x1)
    model = tf.keras.models.Model(inputs=input1, outputs=output)

With the above code I am getting the following error:

IndexError: Exception encountered when calling layer 'softmax' (type Softmax).

tuple index out of range

Call arguments received by layer 'softmax' (type Softmax):
  • inputs=tf.Tensor(shape=(None, 2), dtype=float32)
  • mask=None
Note that `x1.shape` is `(None, 58)`.

Upvotes: 0

Views: 391

Answers (1)

Zeshan Akber

Reputation: 1

The problem is solved now. The MultiHeadAttention layer in TensorFlow expects a 3D input tensor of shape (batch, sequence_length, features). Therefore, to introduce an attention block into a plain feed-forward network, the inputs and outputs of that block have to be reshaped accordingly: expand the 2D activations to 3D before the attention layer and squeeze them back to 2D afterwards. The updated code is as follows:

    x1 = Dense(58, activation='relu')(x1)
    x1 = Dropout(0.1)(x1)
    x1 = tf.expand_dims(x1, axis=1)  # add a sequence axis: (None, 58) -> (None, 1, 58)
    print(x1.shape)

    attention = tf.keras.layers.MultiHeadAttention(num_heads=3, key_dim=x1.shape[2],
                dropout=0.2)(x1, x1)
    x1 = Dropout(0.2)(attention)
    x1 = tf.keras.layers.LayerNormalization()(x1)
    x1 = tf.squeeze(x1, axis=1)  # remove the sequence axis again: (None, 1, 58) -> (None, 58)
    x1 = Dense(10, activation='relu')(x1)
    output = Dense(1, activation='linear')(x1)
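
For reference, a minimal self-contained version of the fixed model might look like the sketch below. The 58-feature input shape, layer sizes, and compile settings are illustrative assumptions, and it relies on TF 2.x Keras wrapping raw ops such as tf.expand_dims and tf.squeeze into layers automatically:

    import tensorflow as tf
    from tensorflow.keras.layers import Dense, Dropout, Input

    # Assumed input: 58 features per sample (illustrative).
    input1 = Input(shape=(58,))

    x1 = Dense(58, activation='relu')(input1)
    x1 = Dropout(0.1)(x1)

    # MultiHeadAttention expects (batch, seq_len, features); add a length-1 sequence axis.
    x1 = tf.expand_dims(x1, axis=1)                  # (None, 58) -> (None, 1, 58)
    attention = tf.keras.layers.MultiHeadAttention(
        num_heads=3, key_dim=x1.shape[2], dropout=0.2)(x1, x1)
    x1 = Dropout(0.2)(attention)
    x1 = tf.keras.layers.LayerNormalization()(x1)
    x1 = tf.squeeze(x1, axis=1)                      # back to (None, 58)

    x1 = Dense(10, activation='relu')(x1)
    output = Dense(1, activation='linear')(x1)

    model = tf.keras.models.Model(inputs=input1, outputs=output)
    model.compile(optimizer='adam', loss='mse')      # regression setup (assumed)
    model.summary()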

Upvotes: 0
