StackOverflow Questions for Tag: self-attention

sarvenaz

Reputation: 11

Non deterministic behavior of a CNN network after adding self attention

tensorflow machine-learning keras conv-neural-network self-attention

Score: 1

Answers: 0

cangozpi

Reputation: 159

Trouble understanding the formula for estimating dense self-attention FLOPS per Token given as 6LH(2QT)

large-language-model transformer-model attention-model self-attention

Score: 0

Answers: 0

BigWinnz101

Reputation: 63

Failing to Finalize Execution Plan Using cuDNN Backend to Create a Fused Attention fprop Graph

c++ cudnn self-attention multihead-attention

Score: -4

Answers: 1

Matija Špeletić

Reputation: 146

How to solve "Torch was not compiled with flash attention" warning?

pytorch self-attention vision-transformer

Score: 7

Answers: 2

killermama98

Reputation: 45

Receptive Field in Swin Transformer

conv-neural-network transformer-model attention-model self-attention receptive-field

Score: 1

Answers: 2

chungking youjun

Reputation: 1

flash attention gives different result for tokens of identical embeddings?

python pytorch huggingface-transformers transformer-model self-attention

Score: 0

Answers: 0

Chiara

Reputation: 490

How to read a BERT attention weight matrix?

huggingface-transformers bert-language-model attention-model self-attention multihead-attention

Score: 3

Answers: 2

jun j

Reputation: 1

How to out put attentions in Transformers BART model

python transformer-model self-attention bart

Score: 0

Answers: 1

phd Mom

Reputation: 11

multihead self-attention for sentiment analysis not accurate results

heatmap sentiment-analysis attention-model self-attention multihead-attention

Score: 1

Answers: 0

MrGeniusProgrammer

Reputation: 11

cannot back propagate on multi head attention tensorflowjs

tensorflow tensorflow.js transformer-model self-attention multihead-attention

Score: 1

Answers: 0

Rohollah

Reputation: 1

How to handle None values for condition_input in a CrossAttention block in PyTorch?

python deep-learning pytorch transformer-model self-attention

Score: 0

Answers: 0

Peter

Reputation: 9

PyTorch Vision Transformer - How Visualise Attention Layers

python pytorch self-attention vision-transformer multihead-attention

Score: 0

Answers: 0

Wassim Jaoui

Reputation: 95

Interpreting the rows and columns of the attention Heatmap

nlp heatmap attention-model self-attention multihead-attention

Score: 0

Answers: 0

Wassim Jaoui

Reputation: 95

Attention Tensor Shape meaning

python nlp transformer-model attention-model self-attention

Score: 1

Answers: 0

Christina

Reputation: 1

Can we mask two words at the same time in attention mechanism?

word2vec masking word-embedding attention-model self-attention

Score: 0

Answers: 0

user24820368

Reputation: 1

How to retrieve query/key/values/output parameters of BERT pretrained?

python tensorflow bert-language-model transformer-model self-attention

Score: 0

Answers: 0

user24606639

Reputation: 1

ValueError: Dimension must be 2 but is 3 (keras Attention)

python keras tf.keras valueerror self-attention

Score: 0

Answers: 0

sk-19

Reputation: 13

Attention Mechanism Scores are the same

nlp sentiment-analysis attention-model self-attention multihead-attention

Score: 0

Answers: 0

Razbolt

Reputation: 39

Pytorch- Is it possible to apply self-attention into Faster R-CNN's ROI layer?

python pytorch faster-rcnn self-attention

Score: 0

Answers: 0

thejbug

Reputation: 41

Can tensorflow optimize the extract_patches operation to avoid large redundant tensors?

r tensorflow conv-neural-network self-attention einsum

Score: 0

Answers: 0

