GNOME Bugzilla – Bug 735412
[PATCH] Correction to the main loop of the AI.
Last modified: 2014-08-27 15:18:59 UTC
Created attachment 284452 [details] [review] Correction to the main loop of the AI. There’s (at least) one bug in the main loop on the AI. Since commit 769c6cb1576cd707e51278065ca2c9be481d3bd5 the program stops estimating the choices of move if he falls on a finish of the game, even if it means he loses the game. It should instead continue to search for a better solution… Here is a patch that (I think, needs verification) should correct the bug. It only tests during the search of moves if it’s a perfect game-end, and handle the other cases at the beginning of the function (more usual in reversi’s algorithms).
Review of attachment 284452 [details] [review]: It looks good to me. Let's just play a few games on -f fast mode to make sure it doesn't start crashing or anything, since this code is really hard to edit without breaking things. ::: src/computer-player.vala @@ +127,2 @@ + /* if it's a perfect game-end, don't try to find better */ + if (g.count_tiles (Player.flip_color (g.current_color)) == 0) I think this whole branch becomes a small optimization, correct? I'd rather remove it to simplify the code.
Created attachment 284457 [details] [review] Correction to the main loop of the AI. That’s in fact a big optimization, and I think it’s quite clear as it’s logical to the “if win” code to be there. Here is a patch that places all the cases where it wins at the same place, is it better?
Review of attachment 284457 [details] [review]: ::: src/computer-player.vala @@ +101,3 @@ + /* An end of the game where he loses */ + if (g.is_complete ()) + return p > 0 ? int.MIN + g.count_tiles (Player.flip_color (g.current_color)) : int.MAX - g.count_tiles (g.current_color); This does not look easier to read than the previous patch. :p I'd rather this be written out with if statements. @@ +120,1 @@ if (g.is_complete ()) I was hoping I'd convinced you to get rid of this condition, but I'll accept it either way.
(In reply to comment #2) > That’s in fact a big optimization, and I think it’s quite clear as it’s logical > to the “if win” code to be there. Here is a patch that places all the cases > where it wins at the same place, is it better? Well I'm still not really convinced. The moves are already ordered such that the moves are expanded in order of how many pieces you won, so a move that results in you winning is probably going to be one of the first nodes expanded. And I don't think speed is really a consideration anyway, since it moves so fast that we have to inject artificial delay.
Created attachment 284550 [details] [review] Correction to the main loop of the AI. That’s not so important for now, here is a patch that puts all in the beginning of the function.
Created attachment 284551 [details] [review] Correction to the main loop of the AI. Oops, wrong patch.
Created attachment 284553 [details] [review] Use a negamax notation. Here is a second patch that makes use of the negamax notation. One little note: I think the commit 5375dd4ce7254a2c5b2063d1e4eae473242e160a introduced a regression in the algorithm (accepting moves with the same score but ordered after in the list, so with less tiles taken…), so while rewriting, I’ve done it another way.
Created attachment 284556 [details] [review] Comment for when there's no move. Here is a little clarification…
Review of attachment 284551 [details] [review]: Super, thanks! But please add a paragraph to the commit message to explain what's going on here. Thanks! ::: src/computer-player.vala @@ +162,3 @@ int next_x_move = 0, next_y_move = 0; var a_new = search (g, strategy, depth - 1, a, b, -p, ref next_x_move, ref next_y_move); + if (a_new > a) This change looks right, but is it related to this commit?
Review of attachment 284556 [details] [review]: Yup.
Review of attachment 284551 [details] [review]: ::: src/computer-player.vala @@ +162,3 @@ int next_x_move = 0, next_y_move = 0; var a_new = search (g, strategy, depth - 1, a, b, -p, ref next_x_move, ref next_y_move); + if (a_new > a) Damn good thing I left this comment: 152 /* 153 * We have to update the principle variant if the new alpha/beta is 154 * equal to the previous one, since we use (0, 0) as our default 155 * move, and a search could otherwise select it even if invalid 156 * when the search reaches the end of the game. 157 */ It doesn't look like anything has changed to make this safe; can you explain? (And if it is safe now, you should remove that comment too.)
Review of attachment 284553 [details] [review]: Yay, this looks simpler. There's probably some horrible bug in this commit, but if so I couldn't find it, so looking good. ::: src/computer-player.vala @@ +90,3 @@ * strategy. */ int x = 0, y = 0; + search (new Game.copy (game), strategy, depth, -10000, 10000, ref x, ref y); What's this change for? I think int.MIN and int.MAX is more clear; is there a reason we can't use them anymore? If so, please define these as static constants. @@ +98,2 @@ { /* End of the game, return a near-infinite evaluation */ This comment would need to be updated
Created attachment 284568 [details] [review] Correction to the main loop of the AI. The ‘oops wrong patch’ was already because of the ‘if (a_new > a)’ thing, looks like I forgot two times to format-patch without checking. ^^’ This patch should be the good (I hope). This ‘if (a_new > a)’ thing is what I explain briefly at comment 7, it prefers moves with less tiles taken (or whatever less optimal depending of what compare_move() does) if the score is the same, that should be corrected in the ‘negamax’ patch. The problem with int.MIN and int.MAX is that, for what I know, int.MIN ≠ - int.MAX; with the negamax algorithm, we invert many times these numbers. That’s why I choosed a big number (10000), that is near-infinite when you look at the other numbers we use. I’m going to make a patch with static constant for that.
Created attachment 284569 [details] [review] Use a negamax notation.
Review of attachment 284568 [details] [review]: We're going to break something. I just know it.
Review of attachment 284569 [details] [review]: Nits: ::: src/computer-player.vala @@ +33,3 @@ } + private static const int positive_infinity = 10000; Add a comment here, otherwise someone is going to change this to int.MAX and int.MIN. Likely me, after I forget why we did this. Oh and we usually use ALL_CAPS for our constants. @@ +137,3 @@ + * We use (0, 0) as our default move; if we don't change that, + * a search could select it even if invalid at the end of the game. + */ OK, this makes more sense to me now that everything's in the proper patch. I think it's right. It looks right. Make sure all the AI tests pass, if you haven't already (run 'make check'). If you could move the comment into the else block instead of separating the if from the else, that would be more typical/readable. @@ +155,2 @@ + int next_x_move = 0, next_y_move = 0; + var a_new = - search (g, strategy, depth - 1, -b, -a, ref next_x_move, ref next_y_move); For clarity, I'd write -1 * search (...
(In reply to comment #13) > This ‘if (a_new > a)’ thing is what I explain briefly at comment 7, it prefers > moves with less tiles taken (or whatever less optimal depending of what > compare_move() does) if the score is the same, that should be corrected in the > ‘negamax’ patch. The number of tiles on the board is just for ordering the moves to maximize pruning. If two moves have the same heuristic value, and one results in more tiles, they're still equally good (one has tiles in more advantageous positions, closer to the edge of the board). So we can pick either one. So the <= is not necessary and you'd normally use a < instead, but <= doesn't harm anything at all, it just slows things down very very slightly. But < is slightly more clear to read, too, so that's the form I'd prefer. <= it was just a hack to get around the crash which I *think* you patch will avoid. It's really hard to reason about this code, though; we might discover otherwise the hard way. :( > The problem with int.MIN and int.MAX is that, for what I know, int.MIN ≠ - > int.MAX; with the negamax algorithm, we invert many times these numbers. That’s > why I choosed a big number (10000), that is near-infinite when you look at the > other numbers we use. I’m going to make a patch with static constant for that. Ooooh good point.
Created attachment 284577 [details] [review] Use a negamax notation. Looks like `make check` says all is OK. > If two moves have the same heuristic value, and one results in more > tiles, they're still equally good (one has tiles in more advantageous > positions, closer to the edge of the board). So we can pick either one. That’s (at least) theorically correct.
Created attachment 284578 [details] [review] Comment for when there's no move.
Created attachment 284580 [details] [review] Use a negamax notation. I finally understood a little thing that didn’t make sense for me in Iagno’s AI code. Here’s a correction of my ‘negamax’ patch.
Review of attachment 284580 [details] [review]: Hoping for the best. I'm clearly very optimistic about this. :p
Review of attachment 284578 [details] [review]: OK
Attachment 284568 [details] pushed as 270e0d1 - Correction to the main loop of the AI. Attachment 284578 [details] pushed as 26fbd47 - Comment for when there's no move. Attachment 284580 [details] pushed as aa969f3 - Use a negamax notation.
Thanks!