This algorithm is an implementation of the paper "Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization" by Chelsea Finn, Sergey Levine and Pieter Abbeel. Why is the mean cost per game increasing instead of decreasing? I've looked into it and I don't really understand it.
The cost is computed in this class:
import torch.nn as nn


class CostNN(nn.Module):
    # Small MLP mapping a state vector to a scalar cost.
    def __init__(
        self,
        state_dim,
        hidden_dim1 = 128,
        out_features = 1,
    ):
        super(CostNN, self).__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden_dim1),
            nn.ReLU(),
            nn.Linear(hidden_dim1, out_features),
        )

    def forward(self, x):
        # x: (batch, state_dim) -> cost: (batch, out_features)
        return self.net(x)
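For context, here is a minimal sketch of how a cost network like this one might be trained with the sample-based loss from the Guided Cost Learning paper (mean cost on demonstrations plus the log of an importance-weighted average of exp(-cost) over policy samples). It is not the code from the question: `ioc_loss`, `state_dim = 4`, the random tensors, and the constant sampling probability of 0.5 are all hypothetical, and costs are evaluated per state rather than per trajectory for brevity.

```python
import math

import torch
import torch.nn as nn


class CostNN(nn.Module):
    # Same architecture as in the question, repeated so the sketch runs on its own.
    def __init__(self, state_dim, hidden_dim1=128, out_features=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden_dim1),
            nn.ReLU(),
            nn.Linear(hidden_dim1, out_features),
        )

    def forward(self, x):
        return self.net(x)


def ioc_loss(costs_demo, costs_samp, log_q_samp):
    # Hypothetical helper: mean demo cost + log( mean( exp(-cost) / q ) ) over samples.
    # logsumexp is used instead of log(mean(exp(...))) for numerical stability.
    m = costs_samp.shape[0]
    log_z = torch.logsumexp(-costs_samp - log_q_samp, dim=0) - math.log(m)
    return costs_demo.mean() + log_z


torch.manual_seed(0)
state_dim = 4  # assumption, not taken from the question
cost_f = CostNN(state_dim)

# Stand-in data: random "demonstration" and "policy sample" states.
demo_states = torch.randn(32, state_dim)
samp_states = torch.randn(64, state_dim)
# Assumed probability of each sample under the sampling policy (constant 0.5 here).
log_q_samp = torch.full((64,), math.log(0.5))

costs_demo = cost_f(demo_states).squeeze(-1)
costs_samp = cost_f(samp_states).squeeze(-1)

loss = ioc_loss(costs_demo, costs_samp, log_q_samp)
loss.backward()  # gradients flow into the cost network's parameters
```

Under this objective, a gradient step on the cost network lowers the cost of demonstration states and raises the cost of sampled states, so the mean cost measured on the policy's own games is not guaranteed to fall while the cost function is still being updated. Whether that accounts for the curve in this particular implementation depends on details not shown in the question.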
Results
I can't figure out whether this behavior is normal.