Aaditya Ura

Reputation: 12679

Attention in Keras: how to add different attention mechanisms to a Keras Dense layer?

I am new to Keras and I am trying to build a simple autoencoder in Keras with attention layers:

Here is what I tried:

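import numpy as np
from keras.layers import Input, Dense, Dropout, Lambda, multiply
from keras.models import Model

# w, kwargs, mvn and DenseTied are defined elsewhere in my code
data = Input(shape=(w,), dtype=np.float32, name='input_da')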
noisy_data = Dropout(rate=0.2, name='drop1')(data)

encoded = Dense(256, activation='relu',
            name='encoded1', **kwargs)(noisy_data)
encoded = Lambda(mvn, name='mvn1')(encoded)

encoded = Dense(128, activation='relu',
            name='encoded2', **kwargs)(encoded)

encoded = Lambda(mvn, name='mvn2')(encoded)
encoded = Dropout(rate=0.5, name='drop2')(encoded)

encoder = Model([data], encoded)
encoded1 = encoder.get_layer('encoded1')
encoded2 = encoder.get_layer('encoded2')

decoded = DenseTied(256, tie_to=encoded2, transpose=True,
            activation='relu', name='decoded2')(encoded)
decoded = Lambda(mvn, name='new_mv')(decoded)

decoded = DenseTied(w, tie_to=encoded1, transpose=True,
            activation='linear', name='decoded1')(decoded)

And it looks like this:

Layer (type)                 Output Shape              Param #   
data (InputLayer)            (None, 2693)              0         
drop1 (Dropout)              (None, 2693)              0         
encoded1 (Dense)             (None, 256)               689664    
mvn1 (Lambda)                (None, 256)               0         
encoded2 (Dense)             (None, 128)               32896     
mvn2 (Lambda)                (None, 128)               0         
drop2 (Dropout)              (None, 128)               0         
decoded2 (DenseTied)         (None, 256)               256       
mvn3 (Lambda)                (None, 256)               0         
decoded1 (DenseTied)         (None, 2693)              2693      

Where can I add an attention layer in this model? Should I add it after the first encoded output and before the second encoded input, i.e. between these two layers?

encoded = Lambda(mvn, name='mvn1')(encoded)


encoded = Dense(128, activation='relu',
            name='encoded2', **kwargs)(encoded)

I was also going through this beautiful library:


It implements various types of attention mechanisms, but they are meant for sequential models. How can I add those attention mechanisms to my model?

I tried a very simple attention mechanism:

encoded = Dense(256, activation='relu',
        name='encoded1', **kwargs)(noisy_data)

encoded = Lambda(mvn, name='mvn1')(encoded)

attention_probs = Dense(256, activation='softmax', name='attention_vec')(encoded)
attention_mul = multiply([encoded, attention_probs], name='attention_mul')
attention_mul = Dense(256)(attention_mul)


encoded = Dense(128, activation='relu',
        name='encoded2', **kwargs)(attention_mul)

Is it in the right place, and can I add any other attention mechanism to this model?

Upvotes: 0

Views: 1603

Answers (1)

Saurabh Kumar

Reputation: 2803

I think what you're doing is a valid way of adding attention: at its core, attention is just a set of learned weights applied to the features, which is what a softmax Dense layer followed by an element-wise multiply gives you. Applying it right after the first encoder layer also seems reasonable, since it lets the model put more weight on the most "informative" features of the encoded representation for your task.
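To make the arithmetic concrete, here is a minimal NumPy sketch of what the `attention_vec` / `attention_mul` pair in the question computes. It is only an illustration, not Keras code: `feature_attention`, `W` and `b` are hypothetical names, and the weights are random rather than trained.

```python
import numpy as np

def softmax(z, axis=-1):
    # Subtract the row-wise max for numerical stability.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def feature_attention(x, W, b):
    """Gate each feature of x by a softmax score computed from x itself."""
    probs = softmax(x @ W + b)   # plays the role of the 'attention_vec' Dense layer
    return x * probs, probs      # plays the role of the 'attention_mul' multiply

rng = np.random.default_rng(0)
batch, units = 4, 256            # 256 matches the width of 'encoded1'
x = rng.standard_normal((batch, units))
W = rng.standard_normal((units, units)) * 0.01  # hypothetical, untrained weights
b = np.zeros(units)

gated, probs = feature_attention(x, W, b)
print(gated.shape)               # (4, 256)
print(probs.sum(axis=1))         # each row sums to 1, up to rounding
```

One thing this makes visible: because the softmax spreads a total weight of 1 over 256 units, the average attention weight is 1/256, so the gated activations are much smaller than the inputs. That is probably why it helps to have a layer after the multiply (such as the extra `Dense(256)` in the question) that can rescale them.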

Upvotes: 0
