SOLUTION: COMP3160 ARTIFICIAL INTELLIGENCE – My Assignment Tutor

COMP3160 ARTIFICIAL INTELLIGENCEAssignment 2 (Weight: 20%)Evolutionary Algorithms for Adversarial Game PlayingDraft Submission Due: 11:55pm, Oct 29, 2021 (Friday, Week 12)Final Submission Due: 11:55pm, Nov 05, 2021 (Friday, Week 13)The goal of this assignment is to appreciate the efficacy of Evolutionary Algorithms, specifically Genetic Algorithm (GA), in the context of game theory. You will be using the DEAPpackage for Genetic Algorithm in order to evolve strategies for repeatedly playing variantsof The Tragedy of the Commons (TC). Basically, there is a group of farmers (herdsman)in a village, and there is a community grazing area (“Commons”). If the farmers limit thenumber of cows that each of them is allowed to graze in the Commons, everyone prospers.If a few farmers flout the convention, they benefit at some cost to others. But if too manyfarmers flout the convention, then everyone suffers due to over-grazing. This game is quitepertinent to the sustainability issue. It is interesting in this context to read the obituary, Elinor Ostrom: An uncommon woman for the commons, that Kenneth J Arrow and colleagueswrote for an unusual political scientist who received Nobel Prize in economics.The Tragedy of the Commons is originally described by Hardin1 as follows:As a rationale being, each herdsman seeks to maximize his gain. Explicitly or implicitly, more or less consciously, he asks, “what is the utility to me of adding one moreanimal to my herd?” This utility has a negative and a positive component.1. The positive component is a function of the increment of one animal. Since theherdsman receives all the proceeds from the sale of the additional animal, thepositive utility is nearly +1.2. The negative component is a function of the additional overgrazing created byone more animal. Since, however, the effects of overgrazing are shared by all[. . . ], the negative utility for any particular decision-making herdsman is only afraction of 1.Adding together the component particular utilities, the rational herdsman concludesthat the only sensible course for him to pursue is to add another animal to his herd.And another; and another… But this is the conclusion reached by each and everyrational herdsman sharing a commons. Therein is the tragedy. Each man is lockedinto a system that compels him to increase his herd without limit – in a world that islimited. Ruin is the destination towards which all men rush, each pursuing his ownbest interest in a society that believes in the freedom of the commons. Freedom in acommons brings ruin to all.We will look at two very simplistic models of this imagined scenario. Both of them aretwo-player games.In the first model (TC1), we assume a small grazing common land that can comfortablysupport two cows altogether, and each cow brings 5 units of utility to its owner. If one more1Hardin, G. “The Tragedy of the Commons.” Science 1968, 162, 1243–1248.1cow is allowed to graze, the cows are not adequately fed, and the benefit from each cowreduces by 1. If instead two more cows are allowed in, the feed is fully inadequate for everycow because of overgrazing, and they bring zero benefit each to their owners. It is assumedthat there is a tacit understanding between the two farmers that they will cooperate, that is,will allow in only one cow each. The payoff matrix for this model is given as follows:Table 1: The Tragedy of the Commons: TC1ROWC DCOLUMN C (5, 5)(4, 8)(8, 4)(0, 0) D Payoffs to: (Column, Row)The second model (TC2) changes the narrative to some extent. The local council decides that the two farmers can take a lease of part of the “commons” for a nominal fee,and use that part exclusively for grazing their own cows. However there is a constraint:each can take an independent decision of whether to lease 1 3 of the commons, or 1 4. Eitherway, part of the commons will be left un-leased, and they will have equal grazing right overthat un-leased part. Here we assume that the total commons has an area of 24, and eachunit area provides one unit of benefit. Again there is the tacit understanding that the twofarmers will cooperate, that is each will take out a lease on 14 of the commons. Howeverthere is incentive to defect, as the payoff matrix shows below. For instance, if one farmercooperates, and the other defects, the cooperator receives 24 4 + 24-( 24 42 + 24 3 ) = 11 units ofbenefit, while the defector will receive 24 3 + 24-( 24 42 + 24 3 ) = 13 units.Table 2: The Tragedy of the Commons: TC2ROWC DCOLUMN C (12, 12)(11, 13)(13, 11)(12, 12) D Payoffs to: (Column, Row)2Task SpecificationThe main issue is the representation of a strategy for playing a game like TC1 or TC2 as abit string. Two papers are provided in the Assignment folder that will help you with this.You are recommended to start with the paper, an easy read, by Adnan Haider. You mightwant to also read the paper by Golbeck, if you find the topic interesting.Although it is possible to represent players/strategies in very many different ways, forparity in marking, you are strongly advised to employ the following representation:Figure 1: Representation of two players. With d as the memory depth, m = 22d. Totallength of a player/chromosome: 22d + 2d.1. BACKGROUND KNOWLEDGE ASSESSMENT [5 marks](a) Consider the Cryptocurrency Mining. You may want to read this New Yorkerarticle about the environmental impact of Bitcoin mining. Which of the twomodels of the Tragedy of the Commons, TC1 and TC2, is closer in spirit to thissituation? Answer this question in no more than 50 words.(b) Find all Dominant Strategy Equilibria, Nash Equilibria and Pareto OptimalStrategy Profiles for the game TC1.(c) Find all Dominant Strategy Equilibria, Nash Equilibria and Pareto OptimalStrategy Profiles for the game TC2.(d) Analyse the results you obtained in items 1b and 1c above. Describe any interesting difference not noted between them. Why did you find those differencesinteresting? Answer this question in no more than 50 words.(e) Axelrod’s tournaments of Iterated Prisoner’s Dilemma show that that co-operationamong players can emerge in a world of selfish agents. What do you expect would happen if different strategies were to play Iterated TC1 (henceforth(ITC1) instead? What would you expect in the case of Iterated TC2 (henceforth(ITC2)? Explain your answer in no more than 50 words. Give particular attention to any difference in what you woudl expect in these two cases.32. IMPLEMENTATION IN PYTHON [10 marks](a) Implement the function:payoff_to_player1(player1, player2, game):returns payoffNote: player2 is the adversary, and payoff is determined by latest movesobtained from respective appropriate memory locations and the payoff matrixfor the relevant game (TC1 or TC2). Assume memory-depth of 2.(b) Implement the function:next_move(player1, player2, game, round):returns player1’s moveNote: player1’s next move is based on both the players’ earlier moves andplayer1’s strategy (encoded in player1’s chromosome). The move to bereturned can be C/D or 0/1 depending on your representation. Note that inearly rounds some default moves are made. Assume memory-depth of 2.(c) Implement the function:process_move(player, move, memory_depth):returns success / failureNote: player’s relevant memory bits are updated based on its latest movemove and its memory depth.(d) Implement the function: score(player1, player2,returns score to player1m_depth, n_rounds, game): Note: player1 iteratively plays the game game against player2 for n roundsnumber of rounds, and its score is returned.(e) You will be using the DEAP package for genetic evolution of strategies. Assume a memory depth of 2.i. Implement genetic evolution of strategies for playing ITC1.ii. Modify the code you wrote as part of Task 2(e)i above so as to geneticallyevolve strategies for playing ITC2.43. ANALYSIS [5 marks](a) What is the best ITC1-strategy that you generated in Task 2(e)i via GA? Whatparameters (fitness function, type of crossover, mutation rate, etc.) did you use,and why?(b) Describe in English the behaviour of that ITC1-strategy (that you identifiedin Task 3a above). Does it align with the prediction you outlined in Task 1eearlier? Explain in no more than 50 words.(c) What is the best ITC2-strategy that you generated in Task 2(e)ii via GA? Whatparameters (fitness function, type of crossover, mutation rate, etc.) did you use,and why?(d) Describe in English the behaviour of the ITC2-strategy (that you identified inTask 3c above). Does it align with the prediction you outlined in Task 1e earlier? Explain in no more than 50 words.(e) Take an appropriate sustainability issue, and outline in no more than 50 wordsthe implications that your results have in that context.What to Submit, and WhenYou will submit two files:1. code.py2. report.pdf.Your code file should include all the Python codes you wrote for this assignment.Your report file should include all the answers (including the Python codes copied-andpasted). It must be submitted in the pdf format. It must have as cover page the one that hasbeen supplied (as part of the document template), duly filled and signed. Also, if relevant,note in the last section anything relevant that is worth noting.You will submit the files in two stages. In the first stage you must submit two draft files(that you will be able to update) by 11:55pm, Friday Week 12:(a) of the program file code.py including atleast the implementation of functions specified in Tasks 2a and 2b.(b) of your report file report.pdf with answers to Tasks 1a-1e. You can modify these files while preparing your final version. However, failure to submit the two draft files by the required date will attracta penalty of 2 marks. The final version of these three files must be submitted by11:55pm, Friday Week 13.5