The AI in The Octagon Theory-Developing
Developing the AI
Q: How did you approach the development of the AI. Which resources did you use while you where researching your AI and where did you get your inspiration from.
DH: Back in the late 80s when I developed the AI for TOT there was no Internet and I didn't know much about AI (even now I don't). So my inspiration came out of the early Apple II games and what game logic I could find in the few English-language personal computer books and magazines I could get my hands on here in Tokyo. The actual ideas I had about AI came out of simple programing and what one could do with it - mostly comparison, and searching for the highest value in an array.
My actual approach went like this: If I were the computer, where is the best place I'd place a piece? It was just common sense to place a piece where it could kill the highest number of opponent pieces and/or move the opponent pieces into weaker positions - weaker positions being those that are closer to the board's edge or to the center 'black hole'. Also TOT had only one piece - the 8-piece that attacks opponent pieces in all eight directions. And this 8-piece was unlimited in number and required no rotation computation. So the AI was a lot less complicated than the current AI (more about this below).
Q: You might have tried a brute force solution. This naturally leads to the question of the size of the search space. How big is it and is it possible to brute force it in a minmax approach? Did you think about pruning the search space?
DH: I guess I do use a brute force approach. The AI scans every open position on the board and evaluates how much damage it can do from that position, keeps a record of it and moves to the next open position, does the same evaluation. If the AI can do more damage from this new position than it could have done from the previous position, this new position is marked as the potential position to move to. This is repeated until all open positions have been evaluated. Currently I only do a single breadth search. I don't do any 'what if' scenarios, or do any depth comparisons of potential moves to see what consequences a move will have a few turns down the road. The AI is still very basic. But it is a bit surprising at how effective it is. But I still wonder how effective it'd be against a very serious player of perfect information board games. I have never gotten to the point of pruning the search space because my AI is still too simple to require it.
Q: Do you have an evaluation function/weighting scheme and which aspects factor in?
DH: Lots of evaluation and weighting: For example. Aggressive vs defensive play. The AI can be adjusted to get more aggressive (or defensive) at any point in the game. But right now this is a one time only switch. Each position on the board is weighted. Safer positions are given higher values. This value is added to the value computed for each open position depending on how many 'kills' (push opponent pieces off the board) and/or how many 'pushes' (cause opponent pieces to move to a less safe position) will occur if a piece is placed in the candidate position.
Description of evaluation function/weighting scheme
Scan all positions from top-left to bottom-right.
If the first board position isn't open move to the next position until an open position is found.
Get board weight for this position then scan in each of the eight compass directions (N, NE, E, SE, S, SW, W, NW) as described in step 4 and 5.
In the north compass direction look for an opponent piece. If an opponent piece is found and it can be 'killed' (knocked off the board) then add to the board weight the 'Normal Kill' value (e.g 34), If the piece can't be killed but can be 'pushed' to a position with a lower relative (to current value) weighting add +5, if the piece will be pushed to a position with a higher relative weighting add -5. The total of the additions becomes the temporary kill value for this direction.
Repeat for the rest of the compass directions. Add all the temporary kill values to come up with the total kill value for this position and store the value and the position location for later comparison.
If this total kill value is equal to or higher than the previous kill value, replace the previous total kill value and board position location with the new total kill value and position location. If this total kill value is equal to the previous kill value build a list of equal kill values. The kill value will later be selected randomly from this list.
Go back to step one and repeat for all open board positions. When all positions have been tried, the one with the highest total kill value is selected (or selected randomly from the equal kill value list if it exists), and its board position location is passed to the client which handles the killing/pushing of the opponent pieces. The client then passes the new board state to the AI.
When the game gets close to the end, the 'Normal Kill' value of 34 can be increased to an arbitrary 'End Game Kill' value, say 68 (doubled) to boost the end game aggressiveness of the AI. Also taken into consideration are how many of the limited pieces are left, which rotation of the pieces will result in a higher kill value, avoiding moving to the same position more than once or twice in successive turns. etc.
Q: Which heuristics did you use in your AI and which heuristics did you find while you where playing the game yourself.
DH: No heuristics or learning algorithms are used. One thing I learned while playing, that the AI doesn't consider, is strategically blocking opponents' moves. Until recently I only considered moves that could do damage, and if no damage could be done, to place a piece in the safest possible position - far from the edges or center hole or the board. I soon discovered a 3rd way of playing - place pieces in THE positions from which the opponent could do a lot of damage. (This is something I want to put in an updated AI.)
Q: Do you have different difficulty settings for your AI - if so how did you achieve the different behaviors?
DH: I have difficulty levels. But instead of choosing a difficulty, you chose an AI player that is rated as aggressive or defensive with a strength rating on a 1 to 10 scale. My strongest is only rated at 4 as I expect I can make the AI a lot stronger using 'real' AI techniques. And I expect other people, like the people who visit AiGameDev.com, can do a lot better than I can. The weakest AI players use an earlier version of the AI that in its implementation is just not a good as the newer version. On top of that I use weightings to adjust if the AI will chose a safe position that does little damage over a position that does lots of damage. A random factor is also used to decide which it will chose. Also adjusting the weighting the AI gives to kills makes it weaker/stronger. How early and when to use the successively stronger pieces is also taken into consideration.
Q: How many different versions of your AI did you develop? What did not work that you thought would work - were there any surprises?
DH: Two major, and many minor versions. The 1st major version was the original AI for the Apple II+ TOT back in the late 80s. It only had to consider one piece to use (the 8-piece) to attack the opponent's pieces. This version was done completely in Applesoft Basic. And there were two surprises:
It beat me the 1st time (and every time) I played. I could never beat it which I thought was cool, but weird since I was effectively playing against my own thought patterns.
The computer would take about two minutes to make a move. Way too slooooow.
This led to the 2nd minor version: I kept the game in Applesoft Basic but I redid the complete AI part in Motorola 6502 assembly language. I had to order a book from the USA and teach myself assembly language. There was one surprise:
It was FAST! Way too fast!. As soon as the human player (me) would make a move. BAM! the AI would make its move. I'd make my next move. then BAM! the computer would make its next move. This was also cool but it didn't make for good game play - no suspense over what the computer would do. So I simply introduced a loop that'd wait for a random number of seconds before the computer would make its move.
The 2nd major version is the current one in which the AI has to consider which piece to use, not only the 8-piece from the original TOT, but also the new 4-piece, 2-piece, and 1-piece. And the rotation of these new pieces. And in addition the AI has to consider that only the 1-piece is unlimited in number.
There are a few of minor versions in the game that just make use of weighting to change the 'personalities' and strength of each AI player. And there are a few unreleased minor versions that dealt mostly with deciding which piece to use while taking into consideration how many of each piece was left. They are unreleased because I didn't think they were very effective.
You must Sign up as a member of Effecthub to view the content.
A PHP Error was encountered
Severity: Notice
Message: Undefined index: HTTP_ACCEPT_LANGUAGE
Filename: helpers/time_helper.php
Line Number: 22
1704 views 0 comments
You must Sign up as a member of Effecthub to join the conversation.