Yingqiang Gao

Reputation: 999

pytorch masked_fill: why can't I mask all zeros?

I want to mask all the zeros in the score matrix with -np.inf, but only some of the zeros get masked. In the output (originally shown as a screenshot), the upper right corner still contains zeros that were not replaced with -np.inf.

Here's my code:

import math
import numpy as np
import torch

# Four random rows followed by two all-zero (padding) rows, each of length 10
q = torch.Tensor([np.random.random(10),np.random.random(10),np.random.random(10), np.random.random(10), np.zeros(10), np.zeros(10)])
k = torch.Tensor([np.random.random(10),np.random.random(10),np.random.random(10), np.random.random(10), np.zeros(10), np.zeros(10)])
scores = torch.matmul(q, k.transpose(0,1)) / math.sqrt(10)
mask = torch.Tensor([1,1,1,1,0,0])
mask = mask.unsqueeze(1)  # shape [6, 1]
scores = scores.masked_fill(mask==0, -np.inf)

Maybe the mask is wrong?

Upvotes: 4

Views: 15461

Answers (4)

pourya

Reputation: 67

Alternatively, changing your code a little makes it work: apply the mask once along the rows and once along the columns.

import math
import numpy as np
import torch

q = torch.Tensor([np.random.random(10),np.random.random(10),np.random.random(10), np.random.random(10), np.zeros(10), np.zeros(10)])
k = torch.Tensor([np.random.random(10),np.random.random(10),np.random.random(10), np.random.random(10), np.zeros(10), np.zeros(10)])
scores = torch.matmul(q, k.transpose(0,1)) / math.sqrt(10)
mask = torch.Tensor([1,1,1,1,0,0])
mask2 = mask.unsqueeze(1)  # shape [6, 1]: masks the padded rows
scores = scores.masked_fill(mask2==0, -np.inf)
mask = mask.unsqueeze(0)  # shape [1, 6]: masks the padded columns
scores = scores.masked_fill(mask==0, -np.inf)
scores
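
The two masked_fill calls above could probably also be folded into one by combining the row and column masks first. This is only a sketch: the name pair_mask and the random scores tensor are assumptions for illustration, not part of the original code.

```python
import torch

# Boolean validity mask: True for real positions, False for padding
mask = torch.tensor([1, 1, 1, 1, 0, 0], dtype=torch.bool)

# Broadcasting [6, 1] against [1, 6] gives a [6, 6] mask that is True
# only where both the row and the column are real positions
pair_mask = mask.unsqueeze(1) & mask.unsqueeze(0)

# Stand-in scores tensor (hypothetical values)
scores = torch.rand(6, 6)

# Fill every position where the row or the column is padding
scores = scores.masked_fill(~pair_mask, float("-inf"))
print(scores)
```

The result should match applying the row mask and the column mask one after the other.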

Upvotes: 0

Vivek Tyagi

Reputation: 51

Ying, your code is right and the output shows the expected behaviour. Your mask currently has shape [6, 1], so it is broadcast across the columns and masks the last two elements of each column, that is, the last two rows.

>>> mask = torch.Tensor([1,1,1,1,0,0])
>>> mask.shape
torch.Size([6])
>>> mask = mask.unsqueeze(1)
>>> mask.shape
torch.Size([6, 1])
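
To make the broadcasting visible, here is a small sketch that compares a [6, 1] mask with a [1, 6] mask. The all-ones scores tensor and the names row_masked and col_masked are assumptions for illustration only.

```python
import torch

# Stand-in scores tensor with no zeros, so only the mask decides what is filled
scores = torch.ones(6, 6)
mask = torch.tensor([1, 1, 1, 1, 0, 0])

# [6, 1] mask: each row's flag is broadcast across all columns,
# so whole rows 4 and 5 are filled
row_masked = scores.masked_fill(mask.unsqueeze(1) == 0, float("-inf"))

# [1, 6] mask: each column's flag is broadcast across all rows,
# so whole columns 4 and 5 are filled
col_masked = scores.masked_fill(mask.unsqueeze(0) == 0, float("-inf"))

print(row_masked)
print(col_masked)
```

With unsqueeze(1) only the bottom two rows become -inf, which is why the upper right corner of the original output is left untouched.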

Upvotes: 1

Vivek Tyagi

Reputation: 51

In mujjiga's code, the scores tensor itself is used as the mask, so every 0 in it is replaced with -inf. That is not the usual intended use of a mask, though: a mask is generally independent of the tensor one wants to mask.
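
A toy case, assumed here purely for illustration, where the two approaches differ: two real (non-padding) vectors that happen to be orthogonal produce a score of exactly 0, which a value-based mask would also replace. The tensors and the names valid, by_value and by_mask are hypothetical.

```python
import torch

# Rows 0 and 1 are real vectors that are orthogonal; row 2 is padding
q = torch.tensor([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
k = q.clone()
scores = q @ k.T

# Independent validity mask: True for real rows, False for padding
valid = torch.tensor([True, True, False])

# Value-based masking: fills every exact zero, including legitimate ones
by_value = scores.masked_fill(scores == 0, float("-inf"))

# Mask-based masking: fills only positions that involve padding
pair_valid = valid.unsqueeze(1) & valid.unsqueeze(0)
by_mask = scores.masked_fill(~pair_valid, float("-inf"))

print(by_value)
print(by_mask)
```

Here by_value turns the legitimate zero at position [0, 1] into -inf, while by_mask keeps it and only fills the padding row and column.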

Upvotes: 4

mujjiga

Reputation: 16856

Your mask is wrong. Try

scores = scores.masked_fill(scores == 0, -np.inf)

scores now looks like

tensor([[1.4796, 1.2361, 1.2137, 0.9487,   -inf,   -inf],
        [0.6889, 0.4428, 0.6302, 0.4388,   -inf,   -inf],
        [0.8842, 0.7614, 0.8311, 0.6431,   -inf,   -inf],
        [0.9884, 0.8430, 0.7982, 0.7323,   -inf,   -inf],
        [  -inf,   -inf,   -inf,   -inf,   -inf,   -inf],
        [  -inf,   -inf,   -inf,   -inf,   -inf,   -inf]])

Upvotes: 10
