Reputation: 724
Has anyone implemented Deep Q-learning to solve a grid world problem where the state is the [x, y] coordinates of the player and the goal is to reach a certain coordinate [A, B]? The reward could be -1 for each step and +10 for reaching [A, B]. [A, B] is always fixed.
Surprisingly, I did not find such an implementation on Google. I tried DQN on Taxi-v3 myself and it didn't work, so I'm looking for a reference implementation to work my way up to my problem.
Upvotes: 0
Views: 1717
Reputation: 1029
For grid worlds, deep Q-learning isn't needed, because the state space is small enough to fit in a plain Q-table; that's probably why few people do it. However, I found a tutorial that uses deep Q-learning with a grid world: https://livebook.manning.com/book/deep-reinforcement-learning-in-action/chapter-3/1
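To show what the table-based alternative looks like, here is a minimal sketch of tabular Q-learning using the reward scheme from the question (-1 per step, +10 at the goal). The 5x5 grid size, the start cell (0, 0), the goal (4, 4), and all hyperparameters are my own assumptions for illustration; they don't come from the question or from the linked tutorial.

```python
import random

# Assumed setup (not from the question): 5x5 grid, start (0, 0), goal [A, B] = (4, 4).
SIZE = 5
GOAL = (4, 4)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # up, down, right, left


def step(state, action):
    """Apply an action; a move off the grid leaves the player in place."""
    x = min(max(state[0] + action[0], 0), SIZE - 1)
    y = min(max(state[1] + action[1], 0), SIZE - 1)
    next_state = (x, y)
    if next_state == GOAL:
        return next_state, 10.0, True   # +10 for reaching the goal
    return next_state, -1.0, False      # -1 for every other step


def train(episodes=2000, alpha=0.5, gamma=0.95, epsilon=0.1, max_steps=200, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    # One row of action values per grid cell, initialised to zero.
    q = {(x, y): [0.0] * len(ACTIONS) for x in range(SIZE) for y in range(SIZE)}
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))                           # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])   # exploit
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning target: bootstrap from the best next action unless terminal.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


def greedy_path(q, start=(0, 0), max_steps=50):
    """Follow the highest-valued action from each state until the goal or a step limit."""
    state, path = start, [start]
    for _ in range(max_steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state, _, done = step(state, ACTIONS[a])
        path.append(state)
        if done:
            break
    return path
```

Calling `greedy_path(train())` walks the learned policy from the start cell. If you do want to move on to DQN afterwards, the same `step` function can serve as the environment, with the Q-table swapped for a small network that takes [x, y] as input.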
Upvotes: 2