Reputation: 11
I hope you are doing well. I am currently working on a project where we need to implement a Connect 4 agent using MCTS (Monte Carlo tree search). As far as I understand, MCTS basically consists of the following stages:
1) Tree construction
2) Selection by UCB1 values until we reach a leaf node
3) Expansion if the leaf node has already been visited
4) Rollout: random simulation until a terminal state is reached, then score this terminal state (e.g. we won --> score = 1, we lost --> score = -1, draw --> score = 0)
5) Backpropagation of the score value, adding one visit to each node on the visited path
6) Decide the move depending on the score values of the first level (the children of the root).
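For reference, the selection rule in stage 2 can be sketched roughly as follows. This is only a minimal sketch, not our actual code: the names `ucb1` and `select_child`, the `(total_score, visits)` pair layout and the exploration constant `c = sqrt(2)` are assumptions made for illustration.

```python
import math

def ucb1(total_score, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 value of a child node.

    Unvisited children get +inf so that they are tried before any
    visited sibling. `c` is an assumed exploration constant.
    """
    if visits == 0:
        return math.inf
    exploitation = total_score / visits
    exploration = c * math.sqrt(math.log(parent_visits) / visits)
    return exploitation + exploration

def select_child(children, parent_visits):
    """Return the index of the child with the highest UCB1 value.

    `children` is a list of (total_score, visits) pairs (hypothetical layout).
    """
    return max(range(len(children)),
               key=lambda i: ucb1(children[i][0], children[i][1], parent_visits))
```

With scores in the range -1 to 1 as in stage 4, the exploration constant would probably need tuning, since UCB1 is usually stated for rewards between 0 and 1.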
Our code is working pretty well. Nevertheless, I am not sure how to perform the expansion stage. If we get to a leaf node that has already been visited, we know we need to expand the tree from this node. Suppose there are 3 possible moves. How do we decide whether to expand the tree with move = 1, move = 2 or move = 3?
Right now the algorithm randomly chooses one of these moves to expand, but I believe this is far from optimal.
Best regards, Alberto
Upvotes: 1
Views: 751
Reputation: 1220
You are supposed to pick the move randomly in MCTS, because at that point the search knows nothing about any of the untried moves. You can add a heuristic to order them, but it may bias the search.
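A minimal sketch of what random expansion could look like, assuming each node keeps a list of moves that have not been expanded yet. The `Node` class and the `apply_move` and `get_legal_moves` callbacks are hypothetical placeholders for whatever your Connect 4 code already provides.

```python
import random

class Node:
    """Hypothetical tree node that tracks which moves are still unexpanded."""

    def __init__(self, state, legal_moves, parent=None, move=None):
        self.state = state
        self.parent = parent
        self.move = move                      # move that led from parent to this node
        self.children = []
        self.untried_moves = list(legal_moves)
        self.visits = 0
        self.total_score = 0.0

def expand(node, apply_move, get_legal_moves, rng=random):
    """Create one child for a uniformly random untried move and return it.

    `apply_move(state, move)` and `get_legal_moves(state)` are assumed
    game-specific callbacks supplied by the caller.
    """
    idx = rng.randrange(len(node.untried_moves))
    move = node.untried_moves.pop(idx)        # each move is expanded at most once
    next_state = apply_move(node.state, move)
    child = Node(next_state, get_legal_moves(next_state), parent=node, move=move)
    node.children.append(child)
    return child
```

Because an expanded move is removed from `untried_moves`, every legal move of a node gets its own child after enough visits, so the random choice only affects the order in which they are tried.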
Upvotes: 0