(All of the code is in the attachment.)
1. Introduction to the Prisoner's Dilemma
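Two suspected accomplices are caught by the police after a crime and held in separate rooms for questioning. The police know both are guilty but lack sufficient evidence. They tell each suspect: if both deny, each gets one year; if both confess, each gets eight years; if one confesses and the other denies, the one who confesses goes free and the one who denies gets ten years. Each prisoner therefore has two choices: confess or deny. Whatever the accomplice chooses, though, each prisoner's best choice is to confess. If the accomplice denies, confessing means going free while denying means one year, so confessing is better. If the accomplice confesses, confessing means eight years while denying means ten, so confessing is still better. (Taken from Baidu, for background only; skip it if you already know the story.)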
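The dominance argument above can be checked mechanically. The following is a minimal sketch that uses only the sentences from the story; the names `SENTENCE` and `best_reply` are made up for this illustration and are not part of the attached package.

```python
# Years in prison for (my_choice, partner_choice), taken from the story above.
CONFESS, DENY = "confess", "deny"
SENTENCE = {
    (DENY, DENY): 1,
    (CONFESS, CONFESS): 8,
    (CONFESS, DENY): 0,    # I confess, partner denies: I go free
    (DENY, CONFESS): 10,   # I deny, partner confesses: I get ten years
}

def best_reply(partner_choice):
    """Return the choice that minimizes my own sentence (hypothetical helper)."""
    return min((CONFESS, DENY), key=lambda mine: SENTENCE[(mine, partner_choice)])

# Whatever the partner does, the best reply turns out to be the same.
for partner in (CONFESS, DENY):
    print("partner:", partner, "-> my best reply:", best_reply(partner))
```

Both lines report "confess" as the best reply, which is exactly why the pair ends up with eight years each instead of one.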
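A prison sentence is a negative payoff, so we switch to scores, which are easier to read: (A, B) denotes the scores A and B receive for their respective choices.
If both sides cooperate, each gets one point; if both defect, each gets zero······
In a repeated game, how can you avoid the lose-lose outcome, or, put differently, score more points for yourself? Here is a Python implementation. YouTube link: https://youtu.be/pMHOqotUiP8
The attachment contains all the .py files used. There are no external libraries; every utility function is written from scratch. Besides showing how to implement this in Python, the project also shows that you do not have to cram all of your code into a single .py file.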
2. Package overview
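After downloading and unpacking, run main.py directly. Option 1 tests one fixed strategy against all strategies and reports its scores; option 2 lets you play against a fixed strategy yourself. Take titForTat under option 1 as an example. It first plays 20 repeated games against the other strategies and computes the average score. Then, "if everybody was doing it", it plays against itself and computes the average score. Option 1 calls AIsimulation.py, which is passed the other strategies to play against; option 2 calls AhumanGame.py. Here is the code of main.py:
import AIsimulation, AhumanGame, alwaysCollude, alwaysDefect, titForTat, randomBasic, randomColluding, randomDefecting, grudger, pavlov, Sanjin, myStrategy, titfor2Tat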
# import the scoring, simulation, and strategy modules
choices = ['1-alwaysCollude','2-alwaysDefect','3-titForTat','4-randomBasic','5-randomColluding','6-randomDefecting','7-grudger','8-pavlov','9-Sanjin','10-myStrategy','11-titfor2Tat']
# list of strategy labels shown to the user
strategies = {1:alwaysCollude,2:alwaysDefect,3:titForTat,4:randomBasic,5:randomColluding,6:randomDefecting,7:grudger,8:pavlov,9:Sanjin,10:myStrategy,11:titfor2Tat}
# dictionary mapping the menu number to the strategy module
print('Here are your game options')
print('press 1 to test your AI strategy against all other AI strategies')
print('press 2 to play against an AI strategy of your choice ')
choice = int(input())
if choice == 1:
    print('here are the strategies, choose one')
    print(choices)  # show the available strategies
    num = int(input('choose a strategy via number'))
    strategy = strategies[num]
    AIsimulation.testStrategy(strategy,20)  # call testStrategy from AIsimulation
elif choice == 2:
    print('who do you want to play against')
    print(choices)
    num = int(input('choose a strategy via number'))
    strategy = strategies[num]
    rounds = int(input('how many rounds do you want to play:'))
    AhumanGame.play(strategy,rounds)
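AIsimulation.py itself is not reproduced in this post, so here is only a rough sketch of what a repeated-game loop with averaging could look like. Several things are assumptions: strategies are plain functions that receive the opponent's previous move (or the string 'start' on the first round) and return 1 for collude or 0 for defect; the 20 is treated as the number of rounds in one match; and the scores are simply minus the prison years from section 1, because the package's own point table is not shown in full. `play_match`, `tit_for_tat` and `always_defect` are illustrative names, not the package's functions.

```python
# Score = minus the years in prison from section 1 (assumed values, see above).
# Keys are (move_a, move_b) with 1 = collude (deny) and 0 = defect (confess).
PAYOFF = {
    (1, 1): (-1, -1),
    (0, 0): (-8, -8),
    (0, 1): (0, -10),
    (1, 0): (-10, 0),
}

def always_defect(opponent_move):
    """Defect no matter what the opponent did."""
    return 0

def tit_for_tat(opponent_move):
    """Collude on the first round, then copy the opponent's previous move."""
    return 1 if opponent_move == 'start' else opponent_move

def play_match(strategy_a, strategy_b, rounds):
    """Play one repeated game and return each side's average score per round."""
    last_a = last_b = 'start'
    total_a = total_b = 0
    for _ in range(rounds):
        # Each side only sees what the other did in the previous round.
        move_a = strategy_a(last_b)
        move_b = strategy_b(last_a)
        score_a, score_b = PAYOFF[(move_a, move_b)]
        total_a += score_a
        total_b += score_b
        last_a, last_b = move_a, move_b
    return total_a / rounds, total_b / rounds

print(play_match(tit_for_tat, always_defect, 20))  # against another strategy
print(play_match(tit_for_tat, tit_for_tat, 20))    # "if everybody was doing it"
```

Running a strategy against every entry of a list of opponents and then against itself is just a loop over `play_match`; the real testStrategy may organize this differently.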
3. Utility functions
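You might expect the utility functions that describe a strategy to be complicated, but a strategy can be as simple as a single function. Take the classic titForTat (tit for tat) as an example: it records the opponent's history and repeats the opponent's last move. If opponentMove == 'start', this is the first move, so it returns 1 (collude). After that it returns opponentHistory[-1]. Here is the code of this utility function: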
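opponentHistory = []  # opponent's moves in the current match, kept between calls

def play(opponentMove):
    # 'start' marks the first round: reset the history and collude (1)
    if opponentMove == 'start':
        opponentHistory.clear()
        return 1
    # record the opponent's latest move and repeat it
    opponentHistory.append(opponentMove)
    return opponentHistory[-1]

def name():
    return 'titForTat'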
It is worth mentioning that there is also a myStrategy.py, where you can write your own strategy and play it against the others. For example:
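opponentHistory = []  # kept at module level so the average covers the whole match

def play(opponentMove):
    # 'start' marks the first round: reset the history and collude (1)
    if opponentMove == 'start':
        opponentHistory.clear()
        return 1
    opponentHistory.append(opponentMove)
    # fraction of rounds in which the opponent colluded so far
    average = sum(opponentHistory)/len(opponentHistory)
    if average > 0.7:
        return 1
    else:
        return 0

def name():
    return 'myStrategy'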
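As another idea for myStrategy.py, here is a sketch of the usual "grudger" rule: collude until the opponent defects once, then defect for the rest of the match. The package ships its own grudger.py, which is not shown in this post, so this is an independent illustration and may differ from it. It assumes the same interface as the examples above (play receives 'start' on the first round, otherwise the opponent's previous move as 1 or 0); the variable `_betrayed` is a made-up name.

```python
# Hypothetical contents for myStrategy.py; not the package's grudger.py.
_betrayed = False  # becomes True once the opponent has defected in this match

def play(opponentMove):
    global _betrayed
    # 'start' marks a new match: forget old grudges and collude (1)
    if opponentMove == 'start':
        _betrayed = False
        return 1
    # a single defection (0) by the opponent is never forgiven
    if opponentMove == 0:
        _betrayed = True
    return 0 if _betrayed else 1

def name():
    return 'myStrategy'
```

Keeping the flag at module level is what lets the strategy remember earlier rounds, and resetting it on 'start' keeps one match from leaking into the next.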
The code is not hard to follow. Have fun!