Reputation: 1023
In the original paper on Proximal Policy Optimization Algorithms, in equation (4), the authors use an operation denoted by KL[]. Unfortunately, they never give a definition for it.
My question:
What does the KL[] operation stand for?
Upvotes: 1
Views: 83
Reputation: 1362
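Most likely it's the KL (Kullback-Leibler) divergence.
KL divergence measures how much one probability distribution differs from another. It is zero when the two distributions are identical, positive otherwise, and not symmetric in its two arguments.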
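To make that concrete, here is a minimal sketch of the KL divergence for two discrete distributions, using the standard definition KL[p, q] = sum_i p_i * log(p_i / q_i). The function name `kl_divergence` and the two example distributions are made up for illustration and are not taken from the paper. As I read equation (4), the two arguments there are the action distributions of the old and the new policy at a given state, but treat that mapping as my assumption.

```python
import math

def kl_divergence(p, q):
    """KL[p, q] = sum_i p_i * log(p_i / q_i) for discrete distributions.

    Assumes p and q have the same length, each sums to 1, and
    q_i > 0 wherever p_i > 0 (otherwise the divergence is infinite).
    Terms with p_i == 0 contribute nothing and are skipped.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical action probabilities over three actions in one state.
old_policy = [0.5, 0.3, 0.2]
new_policy = [0.4, 0.4, 0.2]

kl_same = kl_divergence(old_policy, old_policy)     # identical distributions
kl_diff = kl_divergence(old_policy, new_policy)     # old vs. new
kl_reversed = kl_divergence(new_policy, old_policy) # arguments swapped

print(kl_same, kl_diff, kl_reversed)
```

The value for identical distributions is zero, and swapping the arguments gives a different number, which is why the order inside KL[·, ·] matters.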
Upvotes: 3