progan01
progan01

Reputation: 11

Monte Carlo Tree Search Alternating

Could anybody please clarify how (as I have not found any clear example anywhere) The MCTS algorithm iterates for the second player.

Everything I seem just seems to look like it is playing eg P1 move every time. I understand the steps for one agent but I never find anything showing code where P2 places its counter, which surely must happen when growing the tree.

Essentially I would expect:

for each iter:

select node Player1 expand Player1

select node Player2 expand player 2

rollout backpropogate

next iter

Is this right?? Could anybody please spell out some psuedocode showing that? Either iteratively or recursion i don't mind.

Thanks for any help.

Upvotes: 0

Views: 475

Answers (1)

Shihab Shahriar Khan
Shihab Shahriar Khan

Reputation: 5455

The trick is in backpropagation part, where you update "wins" variable from the point of view of player whose move led into this position.

Code for MCTS

Notice under UCT function, specially the comments:

 #Backpropagate
    while node != None: # backpropagate from the expanded node and work back to the root node
        node.Update(state.GetResult(node.playerJustMoved)) # state is terminal. Update node with result from POV of node.playerJustMoved
        node = node.parentNode

IF you follow the function call, you would realize visit variable is always updated; wins however, is not.

Upvotes: 1

Related Questions