Reputation: 19
Recently, I have been learning about encoder-decoder networks and the attention mechanism, and I found that many papers and blogs implement attention on top of RNNs.
I am interested in whether other kinds of networks can incorporate attention mechanisms. For example, the encoder is a feedforward neural network and the decoder is an RNN. Can feedforward neural networks without a time dimension use attention mechanisms? If so, please give me some suggestions. Thank you in advance!
Upvotes: 1
Views: 1479
Reputation: 869
In general, feed-forward networks treat features as independent; convolutional networks focus on relative location and proximity; RNNs and LSTMs have memory limitations and tend to read in one direction.
In contrast, attention and the Transformer can gather context about a word from distant parts of a sentence, both before and after the position where the word appears, in order to encode information that helps the model understand the word and its role within the sentence.
There is a good model of a feed-forward network with an attention mechanism here, and a rough sketch of its idea follows below:
https://arxiv.org/pdf/1512.08756.pdf
Hope this is useful.
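Below is a minimal NumPy sketch of the feed-forward-attention idea as I read it from that paper: a shared feed-forward layer produces one hidden state per input step, a small scoring network maps each hidden state to a scalar, and a softmax over those scores gives the weights for a single context vector. The one-layer tanh scoring function, the shapes, and the toy data are my own assumptions for illustration, not the paper's exact configuration.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def feed_forward_attention(H, W, b, w):
    """
    H: (T, d) hidden states, one per input step, produced by a
       feed-forward layer applied to each step independently.
    W (d, d), b (d,), w (d,): parameters of a small scoring network
       (assumed form: one tanh layer followed by a dot product).
    Returns a single context vector c (d,) and the weights alpha (T,).
    """
    e = np.tanh(H @ W + b) @ w   # one scalar score per step, shape (T,)
    alpha = softmax(e)           # attention weights, sum to 1
    c = alpha @ H                # weighted average of the hidden states
    return c, alpha

# toy usage with random parameters
rng = np.random.default_rng(0)
T, d = 5, 8
H = rng.normal(size=(T, d))
W = rng.normal(size=(d, d)); b = np.zeros(d); w = rng.normal(size=d)
c, alpha = feed_forward_attention(H, W, b, w)
print(alpha.round(3), c.shape)
```

The point is that nothing here is recurrent: the weighting is a learned, order-aware average over independently computed representations, which is why it drops into a purely feed-forward model.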
Upvotes: 1
Reputation: 831
Yes, it is possible to apply attention / self-attention / multi-head attention mechanisms to other feed-forward networks. It is also possible to use attention mechanisms with CNN-based architectures, i.e. deciding which parts of an image should receive more attention while predicting another part of the image. The main idea behind attention is assigning a weight to all the other inputs while predicting a particular output, or how we correlate words in a sentence for an NLP problem. You can read about the really famous Transformer architecture, which is based on self-attention and has no RNN in it; a minimal sketch of its core attention operation is given below. For a gist of the different types of attention mechanisms you can read this blog.
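For concreteness, here is a minimal NumPy sketch of single-head scaled dot-product self-attention, the core operation inside the Transformer. The projection sizes and the toy data are assumptions made only for illustration.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """
    Scaled dot-product self-attention (single head).
    X: (T, d_model) input representations; Wq/Wk/Wv: (d_model, d_k) projections.
    Row i of the softmax matrix is the weight each position j receives
    when building the output for position i.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # (T, T) pairwise similarities
    weights = softmax(scores)                # each row sums to 1
    return weights @ V                       # (T, d_k) re-weighted values

# toy usage with random projections
rng = np.random.default_rng(1)
T, d_model, d_k = 4, 16, 8
X = rng.normal(size=(T, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8)
```

Note that every position attends to every other position in a single feed-forward pass, with no recurrence involved, which is exactly why this works in architectures without RNNs.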
Upvotes: 0