Reputation: 1007
I am new to RL and the best I've done is CartPole in openAI gym. In cartPole, the API automatically provides the reward given the action taken. How am I supposed to decide the reward when all I have is pixel data and no "magic function" that could tell the reward for a certain action.
Say, I want to make a self driving bot in GTA San Andreas. The input I have access to are raw pixels. How am I supposed to figure out the reward for a certain action it takes?
Upvotes: 0
Views: 379
Reputation: 2563
You need to make up a reward that proxies the behavior you want - and that is actually no trivial business.
If there is some numbers on a fixed part of the screen representing score, then you can use old fashioned image processing techniques to read the numbers and let those be your reward function.
If there is a minimap in a fixed part of the screen with fixed scale and orientation, then you could use minus the distance of your character to a target as reward.
If there are no fixed elements in the UI you can use to proxy the reward, then you are going to have a bad time, unless you can somehow access the internal variables of the console to proxy the reward (using the position coordinates of your PC, for example).
Upvotes: 2