A very interesting problem with the MiniMax algorithm. What could cause this behaviour?

Question

I have implemented the MiniMax algorithm (with alpha-beta pruning), however it behaves in an interesting way. My player will create a huge lead, but when it is time to make the final, winning move it does not take that move and just keeps dragging out the game.

Here is my minimax function:

// Game states are represented by Node objects (holds the move and the board in that state)
//ValueStep is just a pair holding the minimax value and a game move (step) 

private ValueStep minimax(Node gameState,int depth,int alpha,int beta) {

  //Node.MAXDEPTH is a constant
  if(depth == Node.MAXDEPTH || gameOver(gameState.board)) {
      return new ValueStep(gameState.heuristicValue(),gameState.step);
  }

  //this method definately works. child nodes are created with a move and an 
  //updated board and MAX value
  //which determines if they are the maximizing or minimizing players game states.
  gameState.children = gameState.findPossibleStates();

  if(state.MAX) { //maximizing player
      ValueStep best = null;

      for(Node child: gameState.children) {

          ValueStep vs = new ValueStep(minimax(child,depth+1,alpha,beta).value,child.move);

          //values updated here if needed
          if(best==null || vs.value > best.value) best = vs;

          if(vs.value > alpha) alpha = vs.value;

          if(alpha >= beta) break;
      }

      return best;

  } else { //minimizing player
      ValueStep best = null;

      for(Node child: gameState.children) {

          ValueStep vs = new ValueStep(minimax(child,depth+1,alfa,beta).value,child.move);

          if(best==null || vs.value < best.value) best = vs;

          if(vs.value < beta) beta = vs.value;

          if(alpha >= beta) break;
      }

      return best;
  }

}

First I thought the problem is with my evaluating function, but if it is, I couldn't find it. In this game both players have a score and my function simply calculates the heuristic value from the score difference. Here it is:

public int heuristicValue() {

       //I calculate the score difference here in this state and save it in 
       //the variable scoreDiff. scoreDiff will be positive if I am winning 
       //here, negative if im loosing.

        //"this" is a Node object here. If the game is over here, special
        //heuristic values are returned, depending on who wins (or if its a 
        //draw) 
        if(gameOver(this.board)) {
            if(scoreDiff>0) {
                return Integer.MAX_VALUE;  
            } else if(scoreDiff==0) {
                return 0;
            } else {
                return Integer.MIN_VALUE;
            }
        }

        int value = 0;
        value += 100*scoreDiff; //caluclate the heuristic value using the score differerence. If its high, the value will be high as well 

      return value;
  }

I have "translated" my code into english, so there may be typos. Im fairly sure the problem is here somewhere, but if you need some other code then I'll update the question. Again, my player can create an advantage, but it won't make the final winning move for some reason. I appreciate your help!

Dennis Soemers · Accepted Answer

Suppose that your Minimax player is in a position where it can prove that it can guarantee a win. There will often be many different ways in which it can still guarantee an eventual win. Some moves may be instant wins, some moves may drag the game out unnecessarily... as long as it's not a really stupid move that suddenly allows the opponent to win (or draw), they are all wins, and they all have the same game-theoretic value (Integer.MAX_VALUE in your code).

Your Minimax algorithm does not distinguish between those moves, and simply plays the one that happens to be the first in your gameState.children list. That may be a fast, shallow win, or it may be a slow, very deep win.

There are two easy ways to make your Minimax algorithm prioritize fast wins over slow wins:

The best option (because it also has many other advantages) is to use Iterative Deepening. You can look that up for details, but the basic idea is that you first do a Minimax search with a depth limit of 1, then another one with a depth limit of 2, then with a depth limit of 3, etc. As soon as one of your searches proves a win, you can terminate searching and play that winning move. This will make your algorithm always prefer the shortest wins (because those are found first).
Alternatively, you can modify your heuristicValue() function to incorporate the search depth. For example, you can return Integer.MAX_VALUE - depth in winning positions. That will make faster wins actually have a slightly larger evaluation.

A very interesting problem with the MiniMax algorithm. What could cause this behaviour?

Answers (1)

Related Questions