Marco

Reputation: 33

Way to multiply these tensors while keeping gradients

I have a function with two inputs: heatmaps and feature maps. The heatmaps have a shape of (20, 14, 64, 64) and the feature maps have a shape of (20, 64, 64, 64), where 20 is the batch size and 14 is the number of keypoints. Both heatmaps and feature maps have spatial dimensions of 64x64, and the feature maps have 64 channels (in the second dimension).

Now I need to multiply each heatmap by each channel of the feature maps: the first heatmap has to be multiplied by all 64 channels of the feature maps, the second by all 64 channels, and so on.

After that, I should have a tensor of shape (20, 14, 64, 64, 64) on which I need to apply global max-pooling.
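
To make the shapes explicit, here is a dummy-tensor sketch of the sizes involved (placeholder names, not my actual model code):

import torch

heatmaps = torch.rand(20, 14, 64, 64)   # (batch, keypoints, H, W)
features = torch.rand(20, 64, 64, 64)   # (batch, channels, H, W)
# desired intermediate product: (20, 14, 64, 64, 64) = (batch, keypoints, channels, H, W)
# desired output after global max-pooling: (20, 14, 64) = (batch, keypoints, channels)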

The problem is that I can't simply create a new tensor for this, because the gradients of the heatmaps and feature maps must be preserved.

My current (slow and not gradient-preserving) code is:

def get_keypoint_representation(self, heatmaps, features):
    heatmaps = heatmaps[0]
    pool = torch.nn.MaxPool2d(features.shape[2])  # note: unused, amax below does the pooling
    # add a keypoint dimension and repeat the feature maps for all 14 keypoints
    features = features[:, None, :, :, :]
    features = features.expand(-1, 14, -1, -1, -1).clone()

    # matrix-multiply each heatmap with each feature channel, one pair at a time
    for i in range(self.cfg.SINGLE_GPU_BATCH_SIZE):
        for j in range(self.cfg.NUM_JOINTS):
            for k in range(features.shape[2]):
                features[i][j][k] = torch.matmul(heatmaps[i][j], features[i][j][k])

    # global max-pooling over the two spatial dimensions
    gmp = features.amax(dim=(-1, -2))
    return gmp

Overview of the task:

[image: overview of the task]

Upvotes: 1

Views: 420

Answers (1)

Ivan

Reputation: 40678

Given a tensor of heatmaps hm shaped (b, k, h, w) and a feature tensor fm shaped (b, c, h, w), you can perform this operation with a single einsum call:

>>> z = torch.einsum('bkhw,bchw->bkchw', hm, fm)
>>> z.shape
torch.Size([20, 14, 64, 64, 64])
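
The subscripts spell out the operation: z[b, k, c, h, w] = hm[b, k, h, w] * fm[b, c, h, w], i.e. an element-wise product broadcast over the keypoint and channel dimensions. Equivalently (a sketch of the same thing without einsum), you can insert singleton dimensions and let broadcasting do the work:

>>> z = hm.unsqueeze(2) * fm.unsqueeze(1)   # (b, k, 1, h, w) * (b, 1, c, h, w)
>>> z.shape
torch.Size([20, 14, 64, 64, 64])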

Then follow with a max-pooling operation over the spatial dimensions using amax:

>>> gmp = z.amax(dim=(-1, -2))
>>> gmp.shape
torch.Size([20, 14, 64])
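
Since the whole point here is keeping the gradients, a quick self-contained check (with random dummy tensors) confirms that gradients flow back to both inputs through the einsum and the amax:

import torch

hm = torch.rand(20, 14, 64, 64, requires_grad=True)   # heatmaps
fm = torch.rand(20, 64, 64, 64, requires_grad=True)   # feature maps

z = torch.einsum('bkhw,bchw->bkchw', hm, fm)   # (20, 14, 64, 64, 64)
gmp = z.amax(dim=(-1, -2))                     # (20, 14, 64)

gmp.sum().backward()
print(hm.grad.shape)   # torch.Size([20, 14, 64, 64])
print(fm.grad.shape)   # torch.Size([20, 64, 64, 64])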

Upvotes: 1
