Mean cost per game in Guided Cost Learning

Question

In this implementation of the paper "Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization" by Chelsea Finn, Sergey Levine, and Pieter Abbeel, why does the mean cost per game increase instead of decrease? I've looked into it and I don't really understand it.

The cost is computed in this class:

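import torch.nn as nn


class CostNN(nn.Module):
    """Small MLP that maps a state to a scalar cost."""

    def __init__(
        self,
        state_dim,
        hidden_dim1=128,
        out_features=1,
    ):
        super().__init__()
        # One hidden layer with ReLU; the output has no final activation,
        # so the predicted cost is unbounded in both directions.
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden_dim1),
            nn.ReLU(),
            nn.Linear(hidden_dim1, out_features),
        )

    def forward(self, x):
        # x: tensor of shape (..., state_dim) -> cost of shape (..., out_features)
        return self.net(x)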
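
For reference, here is a minimal sketch of how a "mean cost per game" could be computed from this class. The question does not show that part of the code, so this is an assumption: the helper `mean_cost_per_game`, the `games` layout (a list of tensors shaped `(T_i, state_dim)`), and `state_dim = 4` are all hypothetical and are not taken from the original implementation.

```python
import torch
import torch.nn as nn


class CostNN(nn.Module):
    """Same architecture as the class in the question."""

    def __init__(self, state_dim, hidden_dim1=128, out_features=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden_dim1),
            nn.ReLU(),
            nn.Linear(hidden_dim1, out_features),
        )

    def forward(self, x):
        return self.net(x)


def mean_cost_per_game(cost_f, games):
    """Average, over games, of the summed per-state cost of each game.

    `games` is a list of float tensors shaped (T_i, state_dim). This layout
    is an assumption; the question does not specify it.
    """
    with torch.no_grad():  # evaluation only, no gradients needed
        totals = [cost_f(states).sum() for states in games]
    return torch.stack(totals).mean().item()


torch.manual_seed(0)
state_dim = 4  # hypothetical observation size
cost_f = CostNN(state_dim)
# Two fake games of different lengths (10 and 25 steps)
games = [torch.randn(10, state_dim), torch.randn(25, state_dim)]
mean_cost = mean_cost_per_game(cost_f, games)
print(f"mean cost per game: {mean_cost:.4f}")
```

Under this definition the value depends both on the cost network's weights and on how many steps each game lasts, so a change in the curve could come from either one.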

Results

I can't figure out whether this behavior is normal.

[Figure: mean cost per game in Guided Cost Learning]

Answers (0)
