An Artificial Intelligence Agent Playing Othello
Author’s Name
Institutional Affiliation
Course
Instructor
Date
Introduction
This report discusses aspects of the assignment specification while working through the problem of building an agent that plays Othello (Othello Online, n.d.).
This project is an application for playing the game of Othello. The Java-based program is named “Iago vs. Othello” (a pun on the Shakespearean tragedy “Othello”). Spectators can watch any of the implemented artificial intelligences face off against each other, or two humans may challenge each other in a two-player match. The agents compute their next move against a human opponent using “traditional” search methods such as Minimax and Alpha-Beta Pruning, and every algorithm may be combined with different strategies. We then tested and evaluated the performance of our agents across a variety of algorithms, search depths, optimizations, and testing methodologies. The next sections introduce the theory behind the algorithms and the heuristic functions employed, and conclude by discussing the results and future developments.
The Agent
In a game, an agent is a player who can assess the current state of play and choose from the available moves. An agent uses a specific algorithm and strategy to determine the best course of action.
The Algorithm
For the most part, the algorithms used in this project are built on Minimax as their foundation. Rooted in Von Neumann’s game theory, the algorithm picks one of the agent’s currently available moves and assesses its ramifications, i.e., how the game state would evolve if that specific move were made. Because the players alternate turns (barring exceptional circumstances), the algorithm then simulates the opponent’s turn: it picks a move for the opponent and assesses its effects before moving on to the next round, where a new move is chosen. Once the algorithm reaches a state where the game has ended, it steps one move backward and explores the other moves available along the sequence that brought it there (Hernandez et al., 2019).
These evaluations continue until the best game-over state has been identified as the objective (i.e., a state that means victory for the agent or, if no victory is reachable, a state that implies a draw). Since the agent now knows the chain of events leading to that goal, it can pick the first move of the sequence that steers the game toward it. There are two fundamental concerns with this method. First, both the player’s and the opponent’s moves are considered while building a sequence; the agent may forecast the move its opponent will make, but it is only a guess.
If the opponent makes a different choice, the game will unfold differently. An optimal player (one who always strives to maximize their final score) works under the assumption that the opponent is also optimal. So, when simulating the opponent’s turns, the agent selects the moves that minimize its own score (Hernandez et al., 2019). This is not actually a concern: non-optimal opponent actions can only lead to lower opponent scores and higher agent scores, so the agent will end up with an unexpectedly high score if the opponent doesn’t make the best possible choices. After all, if there were a better way to minimize the agent’s score, an optimal opponent would have chosen it. The second difficulty is harder to deal with. Suppose, for simplicity, that there are ten possible moves at each turn, each leading to a distinct game state. Considering all possible 2-move sequences, there are 100 states; there are 1,000 possible 3-move sequences, and so on: the number of states to assess grows exponentially with the search depth. It has been estimated that Reversi, with a maximum of 60 turns, has roughly 10^58 possible states.
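To make the recursion concrete, here is a minimal sketch of plain Minimax in Scheme. The helpers game-over?, final-score, legal-moves, apply-move, and opponent are hypothetical stand-ins for the corresponding game rules (forced passes are ignored for brevity):

(define (minimax-value board player max-player)
  ;; Value of BOARD for MAX-PLAYER assuming optimal play on both sides;
  ;; PLAYER is the side to move.
  (if (game-over? board)
      (final-score board max-player)          ; terminal state: exact outcome
      (let ((child-values
             (map (lambda (move)
                    (minimax-value (apply-move board player move)
                                   (opponent player)
                                   max-player))
                  (legal-moves board player))))
        (if (eq? player max-player)
            (apply max child-values)          ; the agent maximizes
            (apply min child-values)))))      ; the opponent minimizes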
Discussing a problem encountered while implementing the alpha-beta pruning algorithm as described in the course text
Alpha-Beta Pruning ends the assessment of a prospective move as soon as it proves to be always worse than a previously examined one.
Even so, the computation required remains excessive.
The computational time necessary to analyze a single move would exceed the age of the universe; exploring that many states is impossible with the computers we have now. Our goal is to have agents calculate moves in an amount of time acceptable to a human opponent, such as a few seconds. Therefore, in addition to Minimax, we’ve incorporated Alpha-Beta Pruning, a well-known optimization. It evaluates moves like Minimax does, but it stops evaluating a potential move as soon as it determines that the move is always worse than another one previously assessed. Since fewer moves and states are considered, less time and memory are used, making Alpha-Beta Pruning a sound optimization of Minimax. Even so, computation time is still excessive.
Consequently, the search is halted at a specific depth. Because the terminal states of such a truncated search (states assessed at the maximum depth) aren’t game-ending, the agent has no way of knowing, in this form, whether they are good or bad. These states must be evaluated by providing the agents with information that can be used to identify the best course of action in a given game scenario. There are a variety of such techniques and playing styles, based on distinct “heuristics.”
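As a sketch of how the depth cutoff and pruning fit together, here is a fail-hard Alpha-Beta search in Scheme, using the same hypothetical helpers as the Minimax sketch above plus an eval-fn for states cut off before the game ends (again ignoring forced passes):

(define (alpha-beta board player max-player depth alpha beta eval-fn)
  (if (or (= depth 0) (game-over? board))
      (eval-fn board max-player)              ; heuristic value at the cutoff
      (let loop ((moves (legal-moves board player))
                 (alpha alpha)
                 (beta beta))
        (if (null? moves)
            (if (eq? player max-player) alpha beta)
            (let ((v (alpha-beta (apply-move board player (car moves))
                                 (opponent player) max-player
                                 (- depth 1) alpha beta eval-fn)))
              (if (eq? player max-player)
                  (let ((alpha (max alpha v)))
                    (if (>= alpha beta)
                        alpha                  ; prune: no better line here
                        (loop (cdr moves) alpha beta)))
                  (let ((beta (min beta v)))
                    (if (<= beta alpha)
                        beta                   ; prune symmetrically for min
                        (loop (cdr moves) alpha beta)))))))))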
Heuristics (An analysis of heuristics in Othello, n.d.)
Discussing High-Level Features of Your Heuristic
Heuristic on Score (HS)
Reversi’s most basic heuristic function determines a player’s score by counting the number of disks of their color currently on the game table. The player with more disks of their color at the end of the game wins. However, the score can fluctuate dramatically from turn to turn, so having more disks in a given game state does not always imply an advantage. Still, this heuristic is valuable as a baseline: it lets us compare the behavior of more complicated heuristics against that of a “beginner” player.
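A sketch of HS in Scheme, assuming a hypothetical accessor (board-ref board (list x y)) that returns 'x, 'o, or 'empty for a square of the 8 by 8 board:

(define (heuristic-score board max-player)
  ;; Disk difference: agent's disks minus opponent's disks.
  (let ((min-player (if (eq? max-player 'x) 'o 'x)))
    (let loop ((x 0) (y 0) (score 0))
      (cond ((> x 7) score)                   ; scanned every column: done
            ((> y 7) (loop (+ x 1) 0 score))  ; next column
            (else
             (let ((sq (board-ref board (list x y))))
               (loop x (+ y 1)
                     (+ score (cond ((eq? sq max-player) 1)
                                    ((eq? sq min-player) -1)
                                    (else 0))))))))))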
Heuristic on Mobility (HM)
This heuristic does not take the player’s total score into account; instead, it considers the following variables:
The player’s actual mobility: the number of legal moves they can make;
The player’s potential mobility: the number of unoccupied squares adjacent to at least one of the opponent’s disks;
The opponent’s potential mobility: the number of unoccupied squares adjacent to at least one of the player’s disks.
Players who cannot make legal movements (i.e., who have zero mobility) must pass their turn, resulting in a significant penalty and an enormous advantage for their opponents, who will almost certainly win the game. Although this method is superior to HS in estimating utility, it is still far from ideal since mobility decreases for both players as the game progresses.
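A sketch of HM using the hypothetical legal-moves and opponent helpers from earlier; the ±1000 values for forced passes are illustrative sentinels, not taken from the paper:

(define (heuristic-mobility board max-player)
  (let* ((min-player  (opponent max-player))
         (my-moves    (length (legal-moves board max-player)))
         (their-moves (length (legal-moves board min-player))))
    (cond ((= my-moves 0)    -1000)   ; agent must pass: huge penalty
          ((= their-moves 0)  1000)   ; opponent must pass: huge bonus
          (else (- my-moves their-moves)))))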
Heuristic on Mobility and Corners (HMC)
Capturing corners is a crucial strategy in Reversi, because disks placed in the corners can never be flipped and therefore exert a lasting influence on the game. A disk in a corner is stable by definition, and its stability can extend to neighboring disks of the same color. Besides mobility, a decent heuristic function must therefore consider the corners and the squares close to them. We give corners a lot of weight, since capturing them is more important than maintaining high mobility, and we penalize positions that offer the opponent access to the corners, since those are undesirable situations.
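A sketch of HMC, reusing the heuristic-mobility and board-ref pieces above; the corner weight of 25 is illustrative, chosen only to dominate typical mobility differences:

(define corner-squares '((0 0) (0 7) (7 0) (7 7)))

(define (corner-score board max-player)
  ;; +25 per agent-owned corner, -25 per opponent-owned corner.
  (apply + (map (lambda (pos)
                  (let ((sq (board-ref board pos)))
                    (cond ((eq? sq max-player) 25)
                          ((eq? sq (opponent max-player)) -25)
                          (else 0))))
                corner-squares)))

(define (heuristic-mobility-corners board max-player)
  (+ (heuristic-mobility board max-player)
     (corner-score board max-player)))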
Heuristic on Mobility, Corners, and Edges (HMCE)
This heuristic function counts the player’s disks on the edges and subtracts the number of the opponent’s edge disks from this value. Corner locations are vital, but positions along the edges are also significant. Using this heuristic, the agent plays more on the edges and prevents the opponent from taking advantage of them.
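A sketch of the edge term added by HMCE (board-ref as before); corners are excluded here since they are already scored separately:

(define (edge-score board max-player)
  (let loop ((x 0) (y 0) (score 0))
    (cond ((> x 7) score)
          ((> y 7) (loop (+ x 1) 0 score))
          (else
           (let ((edge?   (or (= x 0) (= x 7) (= y 0) (= y 7)))
                 (corner? (and (or (= x 0) (= x 7)) (or (= y 0) (= y 7))))
                 (sq      (board-ref board (list x y))))
             (loop x (+ y 1)
                   (+ score
                      (cond ((or (not edge?) corner?) 0)  ; skip non-edges, corners
                            ((eq? sq max-player) 1)
                            ((eq? sq (opponent max-player)) -1)
                            (else 0)))))))))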
Heuristic on Mobility, Corners, Edges, and Stability (HMCES)
HMCE awards a positive score if the player has more disks along the edges (corners included) than their opponent, and a negative score otherwise. However, edge disks are even more strategically significant when the opponent can never seize them. An edge disk is considered stable when one of the following conditions is met:
It occupies one of the four corners; or
It belongs to an unbroken row or column of disks of the same color that ends in one of the four corners.
The stability of central disks is less relevant and relies on edge stability (due to the recursive nature of the stability definition). Therefore, we only evaluated edge stability, which is simpler to calculate.
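The edge-stability test just described translates directly into Scheme (same hypothetical board-ref accessor):

(define (stable-edge-disk? board x y player)
  ;; True if the disk at (x y) is a corner, or is connected to a corner
  ;; by an unbroken run of same-colored disks along its own edge.
  (define (corner? cx cy)
    (and (or (= cx 0) (= cx 7)) (or (= cy 0) (= cy 7))))
  (define (run-reaches-corner? dx dy)
    (let loop ((cx x) (cy y))
      (cond ((not (eq? (board-ref board (list cx cy)) player)) #f)
            ((corner? cx cy) #t)
            (else (loop (+ cx dx) (+ cy dy))))))
  (or (corner? x y)
      (if (or (= y 0) (= y 7))                       ; top/bottom edge
          (or (run-reaches-corner? 1 0) (run-reaches-corner? -1 0))
          (or (run-reaches-corner? 0 1) (run-reaches-corner? 0 -1)))))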
Heuristic on Mobility, Corners, Edges and Stability, Time-variant (HMCEST)
Different tactics are appropriate at different stages of the game. A game of Reversi lasts at most 60 turns, so we decided to weight the value of mobility and stability according to the current turn. This heuristic is implemented by calculating the mobility score and the stability score independently, multiplying them by weights, and summing them together with the corner score (which is not weighted). With T the current turn number, the mobility weight wm and the stability weight ws are defined as follows:
wm = 1.5 and ws = 0.5 if T ≤ 20;
wm = 1.0 and ws = 1.0 if 20 < T ≤ 40;
wm = 0.5 and ws = 1.5 if T > 40.
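The weighted combination can be sketched as follows, where mobility-score and stability-score are hypothetical component functions corresponding to HM and the edge-stability count, corner-score is the unweighted corner term from earlier, and the weights follow the piecewise definition above:

(define (heuristic-time-variant board max-player turn)
  (let-values (((wm ws)
                (cond ((<= turn 20) (values 1.5 0.5))    ; opening: favor mobility
                      ((<= turn 40) (values 1.0 1.0))    ; midgame: balanced
                      (else         (values 0.5 1.5))))) ; endgame: favor stability
    (+ (* wm (mobility-score board max-player))
       (* ws (stability-score board max-player))
       (corner-score board max-player))))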
Two notable features of Othello’s board geometry are the corners and the X-squares (the diagonal neighbors of the corners). Disks that have been played in the corners can never be flipped.
Othello code information
You will use an alpha-beta minimax search with an evaluation function and depth cutoff to develop a computer player for the game of Othello. The function you need to write is:
(create-minimax-player eval-fn depth-cutoff)
It returns a “strategy” function (discussed below) that calls your alpha-beta minimax search function with the relevant evaluation function, depth cutoff, and other information.
(eval-fn board max-player)
Returns a number indicating the likelihood of Max winning the game given a board; max-player will be either ‘X’ or ‘O’. This function must evaluate 8 by 8 Othello boards.
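As a hypothetical usage example, a player searching 4 plies deep with the disk-count heuristic sketched earlier could be created and pitted against the random strategy like this:

(define my-player (create-minimax-player heuristic-score 4))
(play-othello my-player random-player)   ; my-player plays X and moves first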
Getting started
Playing Othello against the random strategy is the best way to get started. Load Othello.SCM and evaluate the following expressions to see how things work.
(define board-size 6)
(play-othello human-player random-player)
The first expression sets a global variable that controls the board size. Its default value is 8, but I recommend reducing it here so that a game finishes in a shorter amount of time.
The second expression plays a game of Othello. Two “strategy functions” are passed to play-othello; the first strategy plays X and always moves first.
To exit the game, type C-c C-c in Edwin/Emacs, or C-g if you’re running it from a shell.
Strategy functions
A strategy function takes two parameters: a board and a player (either “X” or “O”). It returns a list of two integers, the x and y coordinates of the move.
The human-player strategy function prints the board, identifies the player, asks the human for a move, and returns it. If the move isn’t legal, the human-player function will prompt you for a new one.
The random-player strategy function computes the list of legal moves, picks one at random, and returns it.
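For instance, a trivial strategy obeying this contract (using the hypothetical legal-moves helper from earlier) looks like this; a real strategy would run the alpha-beta search here instead:

(define (first-move-player board player)
  ;; No strategy at all: just return the first legal move as (x y).
  (car (legal-moves board player)))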
Here’s what you’ll be writing in your function:
(create-minimax-strategy eval-fn depth-cutoff)
Returns a strategy function. Here are the reasons for structuring the project in this manner:
The parameters of your alpha-beta minimax search function will depend on what information you choose to keep at a node in your search tree and how that information is represented;
It gives me control over the search depth when I test your routines.
The Othello.SCM file contains an example create-minimax-strategy function that is commented out. When it is your agent’s turn to move, your strategy function is invoked with the current board state as a parameter; this state should be the root node of your search for the best move at this point in the game.
Tips for writing an alpha-beta MINIMAX search
Work this problem out in writing before implementing it in Scheme. Alpha-beta search may be implemented in several ways, but note one difference from the text: the algorithms in the text return only the value of the tree, whereas your alpha-beta search must also report which move from the present board state (the root node of the current search) led to that ideal play.
Keep in mind the line “if alpha >= beta then return alpha” for the max player and “if beta <= alpha then return beta” for the min player.
For your convenience, functions that perform max, min, >=, and <= over real numbers extended with infinities are provided; see the section on functions for your alpha-beta search.
Keep in mind that the data threaded through the alpha-beta search (such as the alpha and beta bounds) is distinct from the data you store at nodes.
Tips for writing an evaluation function
Playing Othello against the random strategy (or another person) is an excellent way to build intuition! You will notice, for example, that the edges and corners of the board are more valuable than other board areas.
NOTE: All code testing will be done on 8 by 8 boards. However, building and debugging your code may be simpler if you design an evaluation function that also works on 6 by 6 boards, since a 6-by-6 game takes less time to play.
Function for testing your code
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;
; (print-board board)
;
; prints an ASCII representation of the board to the screen
;
Conclusions and Future Development
The developed algorithms, despite their relative simplicity, demonstrate some forms of intelligent behavior; examples include:
Forcing the opponent into weak moves that the agent can then profit from;
Postponing the capture of a corner when the position isn’t threatened, concentrating on other strategic locations first;
Making moves that seem foolish at first (such as playing near the middle of the table instead of the edge), but whose purpose is revealed by the agent’s plan a few turns later.
Our program might be further enhanced by designing and testing new algorithms based on search-tree exploration. The problem could also be approached inversely, by building an agent that learns to play from collected game experience.
References
An analysis of heuristics in Othello. (n.d.). Retrieved November 19, 2021, from https://courses.cs.washington.edu/courses/cse573/04au/Project/mini1/RUSSIA/Final_Paper.pdf
Hernandez, J., Daza, K., & Florez, H. (2019). Alpha-beta vs. scout algorithms for the Othello game. In CEUR Workshop Proceedings (Vol. 2846).
Othello Online. (n.d.). Othello gameplay. Retrieved November 19, 2021, from https://www.othelloonline.org/