I'm implementing an AI controller for the two-player version of the board game Quoridor in Python, using the alpha-beta pruning variant of the Negamax algorithm. I expected that increasing the search depth would produce a stronger agent, but that is not what happens in my implementation. Going from depth 3 to depth 4 (greater depths currently have unacceptable computation times), the agent goes from playing reasonably well to playing really awful matches. I'm fairly sure (based on the literature I found and on the logic of the game) that the heuristic I'm using is effective, so I can't figure out what the problem is. This is my alpha-beta pruning implementation:
#self._h_type: heuristic type used by the agent
#self._env: board in a given status
def alphaBeta(depth,alpha,beta):
    if self._env.isTerminal() or not depth:
        return (self._env.getHeuristic(self._h_type),None)
    else:
        max_value=(-math.inf,None)
        for a in self._env.getActions():
            self._env.update(a)
            x=alphaBeta(depth-1,(-beta[0],beta[1]),(alpha[0],alpha[1]))
            if -x[0]>max_value[0]:
                max_value=(-x[0],a)
            elif -x[0]==max_value[0] and uniform(0,1)>0.5:
                max_value=(-x[0],a)
            self._env.undo()
            if max_value[0]>alpha[0]: alpha=max_value
            if alpha[0]>=beta[0]: return alpha
        return max_value

move=alphaBeta(self._depth,(-math.inf,None),(math.inf,None))
self._env.update(move[1])
The heuristic function computes the difference between the lengths of the two players' shortest paths to their goal rows (the opponent's minus the current player's, so a positive value means the player to move is closer), which is an objective measure of advantage in this game:
#shortest path heuristic.
#+/- 1000 for terminal state
#difference of shortest paths length to the goal rows
def getSpHeuristic(self):
    if self.isTerminal(): return self.getWmHeuristic()
    else:
        p1=self.getMovingPawn()
        p2=self.getOpponentPawn()
        return len(self._graph.shortestPath(p2.getPosition(),p2.getGoalRow()))\
               -len(self._graph.shortestPath(p1.getPosition(),p1.getGoalRow()))
Upvotes: 1
Views: 531
First, in terms of your implementation, there are a couple of possible issues. The most important one: in the recursive call you negate beta but pass alpha through unchanged. Negamax needs the window negated and swapped, i.e. the child should be searched with alpha = -beta and beta = -alpha. As posted, the child's beta is the parent's un-negated alpha; at the root, for instance, the child receives both bounds at -infinity, so it cuts off after its first move. It is also easier to get the bound arithmetic right if alpha and beta are plain numbers and the best move is tracked in a separate variable, rather than carried inside the tuples.
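If it helps, here is a minimal sketch of how the window is usually handled in a fail-soft negamax. It reuses self._env, getActions, update, undo, getHeuristic and self._h_type from your snippet and assumes it lives on the same agent class as your code; the negamax name and the rest of the structure are illustrative, not a drop-in fix:

import math

# sketch only: fail-soft negamax with the window negated AND swapped on recursion
def negamax(self, depth, alpha, beta):
    if self._env.isTerminal() or depth == 0:
        return (self._env.getHeuristic(self._h_type), None)
    best_value, best_move = -math.inf, None
    for a in self._env.getActions():
        self._env.update(a)
        # the child is searched with (-beta, -alpha): both bounds negated and swapped
        value = -self.negamax(depth - 1, -beta, -alpha)[0]
        self._env.undo()
        if value > best_value:
            best_value, best_move = value, a
        alpha = max(alpha, value)
        if alpha >= beta:  # beta cutoff
            break
    return (best_value, best_move)

# root call would look like: value, move = self.negamax(self._depth, -math.inf, math.inf)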
Second, you're actually asking the question you'll want to ask constantly for your agent: will the gain from additional exploration exceed the loss from the extra computation time it incurs?
Third, for adversarial search specifically, the problem you're having is expected. At larger depths you can run out of time and have to return an answer before the tree has been fully traversed. In such cases the answer is usually worse than a completed search at a smaller depth.
There are all sorts of tricks to increase search while reducing compute. Alpha-beta pruning is one such method. So is negamax.
This site on chess-ai has a good collection of the various tricks one can employ in alpha-beta. Iterative deepening might be the next trick for you to try out.
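For example, here is a minimal sketch of iterative deepening wrapped around a search like the negamax sketch above; the time_limit parameter, its one-second default and the stopping rule are illustrative assumptions, not something from your code:

import time
import math

# sketch only: iterative deepening with an illustrative time budget
def iterativeDeepening(self, max_depth, time_limit=1.0):
    start = time.time()
    best = None
    for depth in range(1, max_depth + 1):
        # each finished iteration overwrites 'best', so stopping early still
        # leaves the result of the deepest search that completed
        best = self.negamax(depth, -math.inf, math.inf)
        if time.time() - start > time_limit:
            break
    return best

A completed shallow iteration also gives you a best move to try first at the next depth, which tends to improve move ordering and therefore how much alpha-beta can prune.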
Upvotes: 1