Advertisement

Tic-Tac-Toe using NegaMax

Started by April 12, 2010 06:12 PM
6 comments, last by Narf the Mouse 14 years, 5 months ago
MinMax/NegaMax sounded like a useful thing to learn, so I've written up a Tic-Tac-Toe implementation. The problem is, it's not picking the best move. So, I've decided to ask for help. I suspect the problem is that it doesn't find the quickest path to victory, only *A* path to victory. The NegaMax implementation:

    static class NegaMax
    {
        public static Int32[] sign = new int[2] { 1, -1 };   //0 is blue, 1 is red


        public static void Analyze(BoardTicTacToe board, Int32 depth, Int32 color, out MoveTicTacToe m)
        {
            List<Tuple2<Int32, MoveTicTacToe>> moves = new List<Tuple2<int, MoveTicTacToe>>();
            BoardTicTacToe c;
            foreach (MoveTicTacToe m2 in board.LegalMoves)
            {
                c = new BoardTicTacToe(board);
                c.MakeMove(m2, sign);
                moves.Add(<span class="cpp-keyword">new</span> Tuple2&lt;<span class="cpp-keyword">int</span>, MoveTicTacToe&gt;(-Analyze(c, depth + <span class="cpp-number">1</span>, <span class="cpp-number">1</span> - color), m2));  <span class="cpp-comment">//Note the "-" before "NegaMax"</span>
            }

            moves.Sort((a, b) =&gt;
            {
                <span class="cpp-keyword">if</span> (a.Item1 * sign &gt; b.Item1 * sign)
                    <span class="cpp-keyword">return</span> -<span class="cpp-number">1</span>;
                <span class="cpp-keyword">else</span> <span class="cpp-keyword">if</span> (a.Item1 * sign &lt; b.Item1 * sign)
                    <span class="cpp-keyword">return</span> <span class="cpp-number">1</span>;
                <span class="cpp-keyword">else</span> <span class="cpp-keyword">return</span> <span class="cpp-number">0</span>;
            }
            );
            m = moves[<span class="cpp-number">0</span>].Item2;
        }


        <span class="cpp-keyword">public</span> <span class="cpp-keyword">static</span> Int32 Analyze(BoardTicTacToe b, Int32 depth, Int32 color)
        {
            <span class="cpp-keyword">if</span> (b.GameOver())
            {
                Int32 result = sign * b.Analyze();
                <span class="cpp-comment">/* if (result &gt; 0)
                    result -= depth;
                else if (result &lt; 0)
                    result -= depth;
                return result; */</span>
            }
            Int32 max = -Int32.MaxValue;
            BoardTicTacToe c;
            foreach (MoveTicTacToe m in b.LegalMoves)
            {
                c = <span class="cpp-keyword">new</span> BoardTicTacToe(b);
                c.MakeMove(m, sign);
                Int32 x = -Analyze(c, depth + <span class="cpp-number">1</span>, <span class="cpp-number">1</span> - color);  <span class="cpp-comment">//Note the "-" before "NegaMax"</span>
                <span class="cpp-keyword">if</span> (x &gt; max) 
                    max = x;
            }
            <span class="cpp-keyword">return</span> max;
        }
    }

</pre></div><!–ENDSCRIPT–> 
Negamax (and all other minimax variants) will not prefer a short victory to any victory unless you do something like assign short victories larger values.

What's an example where you think it's picking the wrong move?
Advertisement
Quite simply, it'll sometimes ignore an "instant" victory move or let the other "player" get a win.
Well, then you have a bug. The good news is that tic-tac-toe is a very small game and it should be easy to find the problem by using a debugging and/or printing out a few intermediate values. Good luck.

The only mistake I can see in a quick glance through your code is that you commented out a useful return statement when the game is over.
I guess your negamax implementation is wrongly done.
I think you have to give the first call as
Analyze(...)
{
...
...
int score=-Negamax(.....and the parameters.)
}

For quick wining than longer wining, make use of returning higher values for quicker win
hint, use variable depth in this case.
Thanks for the help.

I found a bug - It wasn't correctly calculating in .GameOver() I also edited that return back in. In seems to be playing better, but still makes mistakes.

Further updates as I find further problems.
    static class NegaMax    {        public static Int32[] sign = new int[2] { 1, -1 };   //0 is blue, 1 is red        public static void Analyze(BoardTicTacToe board, Int32 color, out MoveTicTacToe m)        {            List<Tuple2<Result, MoveTicTacToe>> moves = new List<Tuple2<Result, MoveTicTacToe>>();            BoardTicTacToe c;            Int32 depth = 0;            foreach (MoveTicTacToe m2 in board.LegalMoves)            {                c = new BoardTicTacToe(board);                c.MakeMove(m2, sign);<br>                moves.Add(<span class="cpp-keyword">new</span> Tuple2&lt;Result, MoveTicTacToe&gt;(-Analyze(c, depth + <span class="cpp-number">1</span>, <span class="cpp-number">1</span> - color), m2));  <span class="cpp-comment">//Note the "-" before "NegaMax"</span><br>            }<br><br>            moves.Sort((a, b) =&gt;<br>            {<br>                <span class="cpp-keyword">if</span> (a.Item1.Score * sign &gt; b.Item1.Score * sign)<br>                    <span class="cpp-keyword">return</span> -<span class="cpp-number">1</span>;<br>                <span class="cpp-keyword">else</span> <span class="cpp-keyword">if</span> (a.Item1.Score * sign &lt; b.Item1.Score * sign)<br>                    <span class="cpp-keyword">return</span> <span class="cpp-number">1</span>;<br>                <span class="cpp-keyword">else</span><br>                    <span class="cpp-keyword">if</span> (a.Item1.Score * sign == b.Item1.Score * sign<br>                        &amp;&amp; a.Item1.Depth &lt; b.Item1.Depth)<br>                        <span class="cpp-keyword">return</span> -<span class="cpp-number">1</span>;<br>                    <span class="cpp-keyword">else</span> <span class="cpp-keyword">if</span> (a.Item1.Score * sign == b.Item1.Score * sign<br>                            &amp;&amp; a.Item1.Depth &gt; b.Item1.Depth)<br>                        <span class="cpp-keyword">return</span> <span class="cpp-number">1</span>;<br>                    <span class="cpp-keyword">else</span><br>                        <span class="cpp-keyword">return</span> <span class="cpp-number">0</span>;<br>                        <br>            }<br>            );<br>            m = moves[<span class="cpp-number">0</span>].Item2;<br>        }<br><br><br>        <span class="cpp-keyword">public</span> <span class="cpp-keyword">struct</span> Result<br>        {<br>            #region Body<br><br>            <span class="cpp-keyword">public</span> Result(Int32 score, Int32 depth)<br>            {<br>                <span class="cpp-keyword">this</span>.Score = score;<br>                <span class="cpp-keyword">this</span>.Depth = depth;<br>            }<br><br><br>            <span class="cpp-keyword">public</span> Int32 Score;<br>            <span class="cpp-keyword">public</span> Int32 Depth;<br><br><br>            <span class="cpp-keyword">public</span> <span class="cpp-keyword">static</span> Result <span class="cpp-keyword">operator</span> -(Result a)<br>            {<br>                a.Score = -a.Score;<br>                <span class="cpp-keyword">return</span> a;<br>            }<br><br><br>            <span class="cpp-keyword">public</span> override string ToString()<br>            {<br>                <span class="cpp-keyword">return</span> <span class="cpp-literal">"Score: "</span> + Score + <span class="cpp-literal">", Depth: "</span> + Depth;<br>            }<br><br>            #endregion<br>        }<br><br><br>        <span class="cpp-keyword">public</span> <span class="cpp-keyword">static</span> Result Analyze(BoardTicTacToe b, Int32 depth, Int32 color)<br>        {<br>            <span class="cpp-keyword">if</span> (b.GameOver())<br>            {<br>                Int32 result = sign * b.Analyze();<br>                <span class="cpp-comment">/* if (result &gt; 0)<br>                    result -= depth;<br>                else if (result &lt; 0)<br>                    result -= depth;<br>                return result; */</span><br>                <span class="cpp-keyword">return</span> <span class="cpp-keyword">new</span> Result(result, depth);<br>            }<br>            Result max = <span class="cpp-keyword">new</span> Result(-Int32.MaxValue, Int32.MaxValue);<br>            BoardTicTacToe c;<br>            foreach (MoveTicTacToe m in b.LegalMoves)<br>            {<br>                c = <span class="cpp-keyword">new</span> BoardTicTacToe(b);<br>                c.MakeMove(m, sign);<br>                Result x = -Analyze(c, depth + <span class="cpp-number">1</span>, <span class="cpp-number">1</span> - color);  <span class="cpp-comment">//Note the "-" before "NegaMax"</span><br>                <span class="cpp-keyword">if</span> (x.Score &gt; max.Score<br>                    || (x.Score == max.Score &amp;&amp; x.Depth &lt; max.Depth)) <br>                    max = x;<br>            }<br>            <span class="cpp-keyword">return</span> max;<br>        }<br>    }<br><br></pre></div><!–ENDSCRIPT–> 
Advertisement
[font=arial,helvetica,sans-serif]I hacked together something in C++ that does the right thing. It's sort of a fun implementation.
#include <iostream>
#include <ctime>
#include <cstdlib>
#include <algorithm>

struct Board {
static unsigned const code[9];

unsigned player_sum[2];
int board[9];

Board() {
player_sum[0]=0x11111111;
player_sum[1]=0x11111111;
for (int i=0; i<9; ++i)
board=-1;
}

void play(int player, int square) {
player_sum[player]+=code[square];
board[square]=player;
}

void undo(int player, int square) {
player_sum[player]-=code[square];
board[square]=-1;
}

bool finished(int &winner) {
if (player_sum[0]&0x44444444){
winner=0;
return true;
}
if (player_sum[1]&0x44444444){
winner=1;
return true;
}
if (player_sum[0]+player_sum[1] == 0x55555555) {
winner=-1;
return true;
}
return false;
}
};

unsigned const Board::code[9] = {
0x10010010, 0x10001000, 0x10000101,
0x01010000, 0x01001011, 0x01000100,
0x00110001, 0x00101000, 0x00100110
};

int nega_max(Board &b, int player, int alpha, int beta, int from_root) {
int winner;

if (b.finished(winner)) {
if (winner==-1)
return -500+(std::rand()%1001);
return (winner == player ? 1 : -1)
*(1000000000 - from_root*1000 -500 + (std::rand()%1001));
}

for (int i=0; i<9; ++i) {
if (b.board==-1) {
b.play(player, i);
int v = -nega_max(b, 1-player, -beta, -alpha, from_root+1);
b.undo(player, i);
if (v>alpha) {
alpha = v;
if (v>beta) {
return v;
}
}
}
}

return alpha;
}

int pick_move(Board &b, int player) {
int best_score=-1000000000;
int best_move=0;

for (int i=0; i<9; ++i) {
if (b.board==-1) {
b.play(player, i);
int v = -nega_max(b, 1-player, -1000000000, -best_score, 1);
b.undo(player, i);
if (v>best_score) {
best_score=v;
best_move=i;
}
}
}
std::cout << "best=" << best_move << " score=" << best_score << '\n';
return best_move;
}

std::ostream &operator << (std::ostream &os, Board const &b) {
for (int j=0; j<3; ++j) {
for (int i=0; i<3; ++i)
os << ".xo"[b.board[3*j+i]+1] << ' ';
os << " ";
for (int i=0; i<3; ++i)
os << (3*j+i) << ' ';
os << '\n';
}
return os;
}

int main() {
std::srand(time(0));
Board b;
int winner;

int human_player;
std::cout << "Enter which player you want to be (0 or 1): ";
std::cin >> human_player;

for (int i=0; !b.finished(winner); i=1-i) {
std::cout << b;
int m;
if (i==human_player)
std::cin >> m;
else
m=pick_move(b,i);
b.play(i,m);
}

std::cout << b;

switch(winner) {
case -1:
std::cout << "It's a draw!\n";
break;
case 0:
std::cout << "`x' wins!\n";
break;
case 1:
std::cout << "`o' wins!\n";
break;
}
}
Thanks - I'll look it over.

This topic is closed to new replies.

Advertisement