Kaleba KB Keitshokile
Kaleba KB Keitshokile

Reputation: 1995

Tic Tac Toe Min Max Algorithm Python. Computer Algorithm is not Robust

Play feel free to ask for clarity as this is a rather long question

I have a Tic Tac Toe ( X and O) game that works well when played by 2 human players.

My current problem is while trying to implement a Min Max Algorithm to it. I have 4 Functions that form the brain of the computer, which all have a short description above them. The First two functions are mainly for context.

I think the main problem is either in how the score is calculated or when choosing a game, many games have the same score and it chooses the first one with the highest score, which may not necessarily be the best choice.

Here is the code.

# Find All Open Positions On the Current Board
def posis(self):
    return [x for x, coordinate  in enumerate(self.gameBoard) if ((coordinate != 'X') and (coordinate != 'O'))]
# Return All Possible Combinations Given The Current Board State           
def pos_moves(self):
    return [arr for arr in permutations(self.posis())]
# The Computer Plays Every Possible Game Based on self.pos_moves()
# At The End It Returns Each Game and Its Score

def min_max(self,player):
    moves = self.pos_moves()
    pos_boards, scores = [],[]
    for move in moves:
        board = self.gameBoard[:] # This is just the current game board
        depth = 0
        function_player = player

        while True: # This is the loop that each possible game (from pos_moves) goes through for evaluation. at the end giving it a score. 
            board[move[depth]] = function_player
            if ((self.win_checker(board)==True) or (self.board_full(board) != False)):
                if self.win_checker(board) == True:
                    if function_player == player: # Meaning the winner of that game is the computer
                        score = 6-(depth + 1)
                    elif function_player != player: # maening the winner of that game is the human player
                        score = -6+(depth + 1)
                else: # If the game is a draw
                    score = 0
                pos_boards.append(move) # Adding the board to pos_boards
                scores.append(score) # Adding the score to scores with the same index as it's corresponding board from above
                break
            function_player = self.change_player(function_player)
            depth+=1       
    return (pos_boards,scores)
#I Think This Is Where The Problem Is
def comp_think(self,player):
    pos_boards = self.min_max(player)[0]
    scores = self.min_max(player)[1]
    play = pos_boards[scores.index(max(scores))] # this is a supposed to be the best path for the computer to choose. But it's not.
    print(play)
    return str(play[0]) # returning the first move in the best path for the computer to play

Here is a sample game:

0 | 1 | 2 
 --------- 
 3 | 4 | 5 
 --------- 
 6 | 7 | 8 


X, Chose a slot: 0 # I play top left

 X | 1 | 2 
 --------- 
 3 | 4 | 5 
 --------- 
 6 | 7 | 8 

(1, 2, 4, 3, 7, 5, 6, 8) #Computer plays top middle but I would like it to play middle middle

 X | O | 2 
 --------- 
 3 | 4 | 5 
 --------- 
 6 | 7 | 8 


X, Chose a slot: 4

 X | O | 2 
 --------- 
 3 | X | 5 
 --------- 
 6 | 7 | 8 

(2, 3, 5, 7, 8, 6)

 X | O | O 
 --------- 
 3 | X | 5 
 --------- 
 6 | 7 | 8 


X, Chose a slot: 8

 X | O | O 
 --------- 
 3 | X | 5 
 --------- 
 6 | 7 | X 

X Wins!!!

X in this case is the human player

Each tuple is the variable play from comp_think(). The first move is the one the computer plays and the second would be the human player. The entire tuple represent the computers idea of the best possible outcome for itself.

Upvotes: 0

Views: 191

Answers (1)

Theo
Theo

Reputation: 623

One potential error I see here is that the computer picks the move that could lead to the best outcome. For example, if it's possible for it to win in only 3 moves it will take that choice. However, that assumes that the player would make certain moves that allow the computer to win, which are probably unlikely. Perhaps, instead of finding the single move that could lead to the highest score, you add up all scores that could emerge from each move and the highest total is the choice it chooses? After all, notice that the tuple it gives is quite unlikely to happen - the player just ignoring the path it would be logical to take.

Also, you could give the depth of the game a bit less weight in determining score, if it even matters at all - why does it benefit the bot to win earlier? A win is a win.

Upvotes: 1

Related Questions