Samuel Beaussant

Reputation: 203

Understanding log_prob for Normal distribution in PyTorch

I'm currently trying to solve Pendulum-v0 from the OpenAI Gym environment, which has a continuous action space. As a result, I need to use a normal distribution to sample my actions. What I don't understand is the dimension of log_prob when using it:

import torch
from torch.distributions import Normal

means = torch.tensor([[0.0538],
                      [0.0651]])
stds = torch.tensor([[0.7865],
                     [0.7792]])

dist = Normal(means, stds)
a = torch.tensor([1.2, 3.4])
d = dist.log_prob(a)
print(d.size())  # torch.Size([2, 2])

I was expecting a tensor of size 2 (one log_prob for each action), but it outputs a tensor of size (2, 2).

However, when using a Categorical distribution for a discrete environment, log_prob has the expected size:

from torch.distributions import Categorical

logits = torch.tensor([[-0.0657, -0.0949],
                       [-0.0586, -0.1007]])

dist = Categorical(logits=logits)
a = torch.tensor([1, 1])
print(dist.log_prob(a).size())  # torch.Size([2])

This gives me a tensor of size (2).

Why is the log_prob for the Normal distribution a different size?

Upvotes: 7

Views: 10902

Answers (1)

AndrisP

Reputation: 81

If one takes a look at the source code of torch.distributions.Normal and finds the definition of the log_prob(value) function, one can see that the main part of the calculation is:

return -((value - self.loc) ** 2) / (2 * var) - ...  # minus constants and log(std), omitted here

where value is a variable containing the values for which you want to calculate the log probability (in your case, a), self.loc is the mean of the distribution (in your case, means), and var is the variance, that is, the square of the standard deviation (in your case, stds**2). One can see that this is indeed the logarithm of the probability density function of the normal distribution, minus the constant and the logarithm of the standard deviation that I don't write out above.
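As a quick sanity check, here is a small sketch that computes that log-density by hand (the omitted constant is the log of the Gaussian normalization, log(sqrt(2*pi))) and compares it with what log_prob returns:

import math
import torch
from torch.distributions import Normal

means = torch.tensor([[0.0538],
                      [0.0651]])
stds = torch.tensor([[0.7865],
                     [0.7792]])
a = torch.tensor([[1.2],
                  [3.4]])

var = stds ** 2
# log N(a | mean, std) = -(a - mean)^2 / (2 * var) - log(std) - log(sqrt(2 * pi))
manual = -((a - means) ** 2) / (2 * var) - stds.log() - math.log(math.sqrt(2 * math.pi))
print(torch.allclose(manual, Normal(means, stds).log_prob(a)))  # True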

In the first example, you define means and stds as column vectors, while the values form a row vector:

means = torch.tensor([[0.0538],
                      [0.0651]])
stds = torch.tensor([[0.7865],
                     [0.7792]])
a = torch.tensor([1.2, 3.4])

But subtracting a row vector from a column vector, which is what the code does in value - self.loc, broadcasts to a matrix in Python (try it!). Thus the result you obtain is a log_prob value for each of your two defined distributions and for each of the values in a.
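You can see the broadcasting in isolation with a minimal snippet:

import torch

col = torch.tensor([[0.0538],
                    [0.0651]])    # shape (2, 1), like self.loc
row = torch.tensor([1.2, 3.4])    # shape (2,), like value
print((row - col).shape)          # torch.Size([2, 2]) -- every pairwise difference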

If you want to obtain a log_prob without the cross terms, then define the variables consistently, i.e., either

means = torch.tensor([[0.0538],
                      [0.0651]])
stds = torch.tensor([[0.7865],
                     [0.7792]])
a = torch.tensor([[1.2],
                  [3.4]])

or

means = torch.tensor([0.0538, 0.0651])
stds = torch.tensor([0.7865, 0.7792])
a = torch.tensor([1.2, 3.4])

This is what you do in your second example, which is why you obtain the result you expected.
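To confirm, both consistent versions give a log_prob whose shape matches the inputs:

import torch
from torch.distributions import Normal

# column-vector version: everything has shape (2, 1)
dist = Normal(torch.tensor([[0.0538], [0.0651]]),
              torch.tensor([[0.7865], [0.7792]]))
print(dist.log_prob(torch.tensor([[1.2], [3.4]])).size())  # torch.Size([2, 1])

# flat version: everything has shape (2,)
dist = Normal(torch.tensor([0.0538, 0.0651]),
              torch.tensor([0.7865, 0.7792]))
print(dist.log_prob(torch.tensor([1.2, 3.4])).size())  # torch.Size([2])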

Upvotes: 8
