Can torch.where() be used in an equivalent broadcasting form?

Question

I have the following for loop in my code. This loop is slowing down the whole execution.

for q in range(batchSize):
    temp=torch.where((composition_matrix == pred[q]).all(dim=1))[0]
    if len(temp)==0:
        output[q]=0
    else:
        output[q]=int(temp[0])

Here, composition_matrix is a [14000, 2] PyTorch tensor containing only positive integers. pred is a [batchSize, 2] tensor and output is a [batchSize] tensor (see the example below). This for loop slows my code down a lot, and I have not been able to find an equivalent broadcasting solution for it.

Does a broadcasting solution exist that eliminates this for loop?

I shall be grateful for any help.

A minimal reproducible example is:

import torch
composition_matrix=torch.randint(3, 10, (14000,2))
batchSize=64
pred=torch.randint(3, 10, (batchSize,2))
output=torch.zeros([batchSize])

for q in range(batchSize):
    temp=torch.where((composition_matrix == pred[q]).all(dim=1))[0]
    if len(temp)==0:
        output[q]=0
    else:
        output[q]=int(temp[0])

Mercury · Accepted Answer

To make it simple, you first need to understand what the operation is essentially doing. You've got two tensors. Tensor A is of shape (14000, 2) and tensor B is of shape (64, 2). The operation you want to do is:

For each row B[i] in B, compare B[i] (of shape (2,)) with A (of shape (14000, 2)). If B[i] occurs within A, set output[i] = index of first occurrence; otherwise set output[i] = 0.

This can actually be done in two lines of code (maybe even one line):

comp = (composition_matrix[:, None, :] == pred).all(dim=-1)
output = torch.argmax(comp.float(), dim=0)

The first line creates comp, the broadcasted comparison of composition_matrix and pred, a boolean tensor of shape (14000, 64).
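The second line needs to find the "index of the first match". This can be done quite simply with argmax: it returns the index of the first "1" or, if all the values are "0", the first index, i.e., 0. That matches the loop, which also writes 0 when there is no match, so "no match" and "match at row 0" stay indistinguishable in both versions.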
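To make the two steps concrete, here is a minimal sketch on toy data. The tensors A and B and their values are made up for illustration and are not from the question; the first query has two matching rows, the second has none.

```python
import torch

# Hypothetical toy data: 4 candidate rows, 3 query rows.
A = torch.tensor([[3, 4], [1, 2], [1, 2], [5, 6]])
B = torch.tensor([[1, 2], [9, 9], [5, 6]])

# (4, 1, 2) == (3, 2) broadcasts to (4, 3, 2); all(dim=-1) reduces it to (4, 3).
# comp[i, j] is True when row i of A equals row j of B.
comp = (A[:, None, :] == B).all(dim=-1)

# For each column (query row), take the index of the first True along dim 0.
# A column with no True at all yields 0.
first = torch.argmax(comp.float(), dim=0)
print(comp.shape, first)
```

Here [1, 2] appears at rows 1 and 2 of A and the earlier index is picked, while [9, 9] appears nowhere and falls back to 0.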

(Note that torch does not support argmax for "bool" tensors, and so comp needed to be cast to another data type.)
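As a sanity check, the sketch below runs the original loop and the broadcast version on the data from the question and compares them. The fixed seed, the names expected, found and output_or_minus1, and the -1 sentinel are my own additions for illustration, not part of the question.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed so the check is repeatable
composition_matrix = torch.randint(3, 10, (14000, 2))
batchSize = 64
pred = torch.randint(3, 10, (batchSize, 2))

# Reference result: the loop from the question.
expected = torch.zeros(batchSize, dtype=torch.long)
for q in range(batchSize):
    temp = torch.where((composition_matrix == pred[q]).all(dim=1))[0]
    if len(temp) > 0:
        expected[q] = temp[0]

# Broadcast version: one comparison, one reduction.
comp = (composition_matrix[:, None, :] == pred).all(dim=-1)  # (14000, 64)
output = torch.argmax(comp.float(), dim=0)                    # (64,)

assert torch.equal(output, expected)

# Optional: mark queries with no match using a hypothetical -1 sentinel,
# so they can be told apart from a genuine match at row 0.
found = comp.any(dim=0)
output_or_minus1 = torch.where(found, output, torch.full_like(output, -1))
```

Note that the comparison materialises an intermediate of shape (14000, 64, 2) before the reduction, so memory use grows with both the number of rows and the batch size.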
