user882763

Reputation: 51

multiplying each element of a matrix by a vector (or array)

Say I have an input array of size (64, 100):

t = torch.randn((64,100))

Now say I want to multiply each of the 6400 elements of t with 6400 separate vectors, each of size 256, to produce a tensor of size [64, 100, 256]. This is what I am doing currently:

import copy
import torch
import torch.nn as nn

def clones(module, N):
    "Produce N identical layers."
    return nn.ModuleList([copy.deepcopy(module) for _ in range(N)])

linears = clones(nn.Linear(1,256, bias=False), 6400)

idx = 0
t_final = []
for i in range(64):
    t_bs = []
    for j in range(100):
        # scale the idx-th weight vector (shape [256]) by the scalar t[i, j]
        t1 = t[i, j] * linears[idx].weight.view(-1)
        idx += 1
        t_bs.append(t1)
    t_bs = torch.cat(t_bs).view(1, 100, 256)
    t_final.append(t_bs)
t_final = torch.cat(t_final)
print(t_final.shape)
Output: torch.Size([64, 100, 256])

Is there a faster and cleaner way of doing the same thing? I tried torch.matmul and torch.dot but couldn't do any better.

Upvotes: 0

Views: 204

Answers (2)

Kaushik Roy

Reputation: 1685

You don't actually need to clone your linear layer if you really want to multiply tensor t by the same linear-layer weight 6400 times. Instead, you can do the following:

t = torch.randn((64, 100)).unsqueeze(-1)  # shape: [64, 100, 1]
w = torch.rand(256).view(1, 1, 256).repeat(64, 100, 1)
# or
w = torch.stack(6400 * [torch.rand(256)]).view(64, 100, 256)
result = t * w  # shape: [64, 100, 256]
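
As a quick sanity check (a minimal sketch, assuming the t and w defined above): since t has shape [64, 100, 1], broadcasting expands the weight for you, so the explicit repeat is optional:

w_small = torch.rand(256).view(1, 1, 256)      # single weight vector
w_big = w_small.repeat(64, 100, 1)             # explicitly tiled copy
print(torch.allclose(t * w_small, t * w_big))  # True: broadcasting tiles automatically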

However, if you want to keep the structure you currently have, you can do the following:

t = torch.randn((64, 100)).unsqueeze(-1)  # shape: [64, 100, 1]
w = torch.stack([lin.weight for lin in linears]).view(64, 100, 256)
result = t * w  # shape: [64, 100, 256]
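
To verify this against the loop in the question (a quick check, assuming t, linears, and t_final are exactly as defined there, with t still of shape [64, 100]):

w = torch.stack([lin.weight for lin in linears]).view(64, 100, 256)
print(torch.allclose(t.unsqueeze(-1) * w, t_final))  # True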

Upvotes: 0

hpwww

Reputation: 565

It seems broadcasting is what you are looking for.

t = torch.randn((64, 100)).view(6400, 1)  # one scalar per row
weights = torch.randn((6400, 256))        # one 256-dim vector per scalar

output = (t * weights).view(64, 100, 256)  # broadcast multiply, then reshape
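
The same elementwise product can also be written without the intermediate reshape, keeping the [64, 100] layout (a small illustrative variant; out_a and out_b are just placeholder names):

t2 = torch.randn(64, 100)
w2 = torch.randn(64, 100, 256)
out_a = t2.unsqueeze(-1) * w2                # broadcast over the last dim
out_b = torch.einsum('bn,bnk->bnk', t2, w2)  # same product in index notation
print(torch.allclose(out_a, out_b))          # True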

Upvotes: 2
