Prisoner's dilemma text

by Steven Flintham (15A) ------------------------ Imagine two menu sitting in a room facing each other across a table. Each has in his hand two cards, marked "Cooperate" and "Defect". Both lay one card face down on the table and after both have done so, a banker turns over the two cards and pays one or both of the players depending on which cards they have played. After an predetermined, but unknown to the players, number of goes the game ends and the players walk alway with their winnings. Payouts are made according to the following table: Key: P1 - player 1's move P2 - player 2's move C - cooperate D - defect P1 C D +-------------------+-------------------+ : REWARD FOR MUTUAL : TEMPTATION TO : : COOPERATION : DEFECT : C : : : : 3 TO BOTH : 5 TO P1 : : : : P2 +-------------------+-------------------+ : TEMPTATION TO : 'PENALTY' FOR : : DEFECT : MUTUAL DEFECTION : D : : : : 5 TO P2 : 1 TO BOTH : : : : +-------------------+-------------------+ So, for instance, if one player played "Defect" and the other played "Cooperate", the player who defected would receive 5. It is important to remember that neither player can see the other's card, since the cards are not turned over until both have played. The big question is - how should one play to obtain the largest possible amount of money? You might reason that if your opponent has played cooperate, your best move is to defect, since you will then receive 5 rather than both of you receiving 3. If your opponent has played defect, your best bet is still to defect, since you will both then receive 1 instead of your opponent receiving 5. So, using logic (?), you have deduced that you should always defect. However, your opponent can legitimately follow exactly the same reasoning and deduce that he/she should also defect all of the time. This leads to you both receiving 1 per go - so you are both worse off than you would be if you both cooperated all the time, which would yield 3 per go! It can be quite interesting to actually play this game with another human - it becomes something of an exercise in trust. At this point, it is worth explaining why it is important that the players do not know how many goes are left at any time. If the duration of the game is unknown, players will be more reluctant to defect on their opponent since they know that their opponent will have a chance at revenge. However, if the players know that it is the last go, there will be no such incentive since there will be no chance for revenge. This knowledge leads to similar behaviour on the second from last move, and therefore the third from last move, and so on, which is why the duration of the game must be kept secret. The computer program -------------------- The accompanying program Comp1 is a computer version of this which pits strategies implemented as BASIC functions against one another. A strategy is a piece of BASIC which, using the information available, must decide whether to lay down a cooperate or defect card. A tournament is held between all the strategies, and a ranking order is produced. To see how to implement a strategy, let's have a look some of the strategies in the program as supplied. Each strategy is preceded by a DATA statement which gives its full name and the name of the function which implements it. So, the Random strategy in the program has the following data line: DATA Random,random because it is called "Random" and is implemented by FNrandom. The function itself must return either a 1 to cooperate or a 2 to defect. These are held in the variables cooperate% and defect% for ease of use, but you can use the numbers if you want. The full listing of the random strategy is: DATA Random,random DEF FNrandom(discard%) =RND(2) Ignore the parameter for the moment. It can be seen that random simply returns either 1 or 2 without any consideration for previous moves, and is therefore in a sense a "non-strategy". A slightly more complex strategy is Tit for Tat: DATA Tit for Tat,tit_for_tat DEF FNtit_for_tat(subscript%) IF round%=1 THEN =cooperate% =move%(round%-1,subscript%) This illustrates several points. Firstly, the parameter passed to the function. This is the second subscript of the move%() array which holds the details of the previous moves by both programs. move%(1,X) holds the result of the first move by both programs - only AFTER the first move has been made, obviously. move%(1,subscript%) is the move which was made by the strategy's opponent on round 1. The current round number is held in the variable round%. Therefore, the Tit for Tat function will always cooperate on the first round. After that, it makes whatever move its opponent made on the previous round. A still more complex (although not necessarily better) strategy is Grudger: DATA Grudger,grudger DEF FNgrudger(subscript%) LOCAL opp_defect%,check% IF round%=1 THEN =cooperate% opp_defect%=FALSE FOR check%=1 TO round%-1 IF move%(check%,subscript%)=defect% THEN opp_defect%=TRUE NEXT IF opp_defect% THEN =defect% =cooperate% This, like Tit for Tat, always cooperates on the first round. After that, it checks all its opponents previous moves to see if it has ever defected. If it has, Grudger also defects. If its opponent has not yet defected, Grudger continues to cooperate. You should now be able to work out what the other strategies provided do. In order to help you to write your own, here are a few tips: Always provide some special action for round 1, when there is no information about your opponent's previous moves. If you want to, you can also look at your own past moves by using (3-subscript%) as the second array subscript. Even if you are called after your opponent, move%(round%,subscript%) will not yet contain your opponent's move, so you can't sneakily look at its choice! Note that in a tournament, all the programs play against each other, including themselves. This means that your strategy will end up playing against itself. Also because of the tournament system, note that how successful a strategy is is determined by how the other strategies behave. For instance, the Tit for Two Tats strategy in the program as supplied is very good in this "climate", but it might be unsuccessful if competing against different strategies. This is true of all tournaments, and is not a failing of the program. Functions must not create any global variables of their own. They can only read the ones provided - round% and move%(). However, you can create as many LOCAL variables as you like. The competition --------------- To enter the competition, you must submit ONE strategy which will be entered into a tournament with all the other entries. You need only submit the DATA line and the function definition itself - they will be appended onto the end of the routine by the competition organiser. Of course, you can write as many strategies as you like for your own testing purposes, but only ONE can be submitted. This is because it would otherwise be possible to submit two strategies, one of which was designed to be exploited by the other in order to increase its score. Although the number of rounds is currently 200, this does not mean that you can use 200-round% in your strategy to tell you how many rounds are left. In the final tournament, the number of rounds will be changed to prevent this kind of cheating! N.B. - the supplied strategies are based on descriptions of those written by several games theorists. I did not design them and they will not be entered into the competition tournament with the exception of Random. The first tournament of this kind was, I believe, held by a Dr/Professor (?) Axelrod, who I feel obliged to mention as the inventor of this kind of contest.