Monte Carlo Tree Search Expansion

Question

I hope you are doing well. I am currently working on a project where we need to implement a Connect 4 agent using MCTS (Monte Carlo tree search). As far as I understand, MCTS basically consists of the following stages:

1) Tree construction

2) Selection by UCB1 values until we reach a leaf node

3) Expansion if the leaf node has already been visited

4) Rollout: random simulation until a terminal state is reached, then score that terminal state (e.g. we won --> score = 1, we lost --> score = -1, draw --> score = 0)

5) Backpropagation of the score value, adding one visit to each node on the visited path

6) Decide on a move based on the score values of the first level of the tree
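To make stages 2 and 5 concrete, here is a minimal sketch of how selection and backpropagation might look. The `Node` class, its field names, and the exploration constant `c = sqrt(2)` are assumptions for illustration, not our actual code. Scores are kept from a single perspective ("we"), as in the scoring described in stage 4.

```python
import math


class Node:
    """Hypothetical tree node; field names are illustrative only."""

    def __init__(self, parent=None, move=None):
        self.parent = parent
        self.move = move          # move that led from parent to this node
        self.children = []
        self.visits = 0
        self.total_score = 0.0


def ucb1(node, c=math.sqrt(2)):
    """UCB1 value of a child; unvisited children get +inf so they are tried first."""
    if node.visits == 0:
        return math.inf
    exploit = node.total_score / node.visits
    explore = c * math.sqrt(math.log(node.parent.visits) / node.visits)
    return exploit + explore


def select_leaf(root):
    """Stage 2: follow the child with the highest UCB1 value until a leaf is reached."""
    node = root
    while node.children:
        node = max(node.children, key=ucb1)
    return node


def backpropagate(node, score):
    """Stage 5: add the rollout score and one visit to every node up to the root."""
    while node is not None:
        node.visits += 1
        node.total_score += score
        node = node.parent
```

With two children under the root, an unvisited child is always selected before a visited one; once both have been visited, the one with the better average score plus exploration bonus is selected.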

Our code is working pretty well. Nevertheless, I am not sure how to perform the expansion stage. If we reach a leaf node that has already been visited, we know we need to expand the tree from this node. Suppose there are 3 possible moves. How do we decide whether to expand the tree with move = 1, move = 2 or move = 3?

Right now the algorithm randomly chooses one of these moves to expand, but I believe this is far from optimal.
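For reference, the random expansion described above could be sketched roughly as follows. This is a simplified illustration, not our exact code: the `untried_moves` list, the `expand_random` function, and the `legal_moves_after` callback (which returns the legal moves in the position reached by a move) are all hypothetical names.

```python
import random


class Node:
    """Hypothetical tree node that remembers which moves have not been expanded yet."""

    def __init__(self, legal_moves, parent=None, move=None):
        self.parent = parent
        self.move = move
        self.children = []
        self.untried_moves = list(legal_moves)


def expand_random(node, legal_moves_after, rng=random):
    """Add one child for a uniformly random untried move and return it."""
    move = rng.choice(node.untried_moves)
    node.untried_moves.remove(move)
    child = Node(legal_moves_after(move), parent=node, move=move)
    node.children.append(child)
    return child


# Usage: a visited leaf with 3 possible moves gets one new child.
leaf = Node(legal_moves=[1, 2, 3])
child = expand_random(leaf, legal_moves_after=lambda move: [1, 2, 3])
```

After the call, `leaf` has one child and two untried moves left, so later visits to the same node would expand the remaining moves one at a time.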

Best regards, Alberto


