Monte Carlo Tree Search - "most promising" move function

Question

I tried to implement a "hello world" MCTS player for tic-tac-toe, but I ran into a problem.

While simulating the game and choosing "the most promising" (exploit/explore) node, I only take the total number of wins into account in the "exploit" part. This causes a problem: the resulting algorithm is not defensive at all. As a result, when choosing between

a move that results in (100 draws; 10 losses)
a move that results in (1 win; 109 losses)

the worse one (1 win; 109 losses) is chosen, because my UCT function greedily uses the average number of wins instead of a "value" that also accounts for draws.

Am I identifying this problem correctly? Should I switch from "avg wins" to some other value metric that takes all result types into account?
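
The difference between the two metrics can be sketched as follows. This is a minimal illustration, not the code from my implementation: the reward values (win = 1, draw = 0.5, loss = 0), the function names, and the exploration constant are all assumptions chosen for the example.

```python
import math

# Assumed reward per outcome; scoring a draw as 0.5 is a common convention,
# not something fixed by MCTS itself.
WIN, DRAW, LOSS = 1.0, 0.5, 0.0


def avg_wins(wins, draws, losses):
    """Exploit term that only counts wins (the behaviour described above)."""
    visits = wins + draws + losses
    return wins / visits


def avg_value(wins, draws, losses):
    """Exploit term that averages a reward over every outcome type."""
    visits = wins + draws + losses
    return (WIN * wins + DRAW * draws + LOSS * losses) / visits


def uct(mean, visits, parent_visits, c=math.sqrt(2)):
    """Standard UCB1-style score: exploit term plus exploration bonus."""
    return mean + c * math.sqrt(math.log(parent_visits) / visits)


# The two moves from the question, as (wins, draws, losses).
moves = {"A": (0, 100, 10), "B": (1, 0, 109)}
parent_visits = sum(sum(stats) for stats in moves.values())

best_by_wins = max(
    moves, key=lambda m: uct(avg_wins(*moves[m]), sum(moves[m]), parent_visits)
)
best_by_value = max(
    moves, key=lambda m: uct(avg_value(*moves[m]), sum(moves[m]), parent_visits)
)
```

Both moves have the same visit count here, so the exploration bonus is identical and the choice depends only on the exploit term: with `avg_wins` move B scores higher, while with `avg_value` move A does.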

Any advice is welcome, thanks.
