[Artificial Intelligence] UC Berkeley Spring 2021 CS188 Project 2: Multi-Agent Search (Pacman)

Project 2: Multi-Agent Search

Introduction

Assignment Overview

This assignment comes from Project 2 of UC Berkeley's Spring 2021 CS188 Artificial Intelligence course. The official project description is linked here: UC Berkeley Spring 2021 Project 2: Multi-Agent Search.

Files

[Figure: overview of the project files from the assignment page]

A few notes up front

This post only covers Problems 2 through 5 of Project 2. If you need Problem 1, please refer to other write-ups.

Development environment: Python 3.9 + VS Code

If Python is not installed on your machine, or the installed version is old, you can run python in cmd to check.
If Python is missing, cmd will redirect you to the Python install page:

[Screenshot: cmd redirecting to the Python install page]
After installation, running python in cmd looks like this:
[Screenshot: Python running in cmd after installation]
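To check the installed version directly without opening the interactive interpreter, the standard version flag works as well:

python --version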

Problem 2: Minimax

Minimax is a key algorithm in adversarial search: each min layer takes the worst value the opponent can force on us, and each max layer takes the best value we can secure against that.
In this project, Pacman (agent index 0) is the maximizer seeking the highest score, while the ghosts (indices >= 1) are the minimizers trying to drive that score down.
We can translate the pseudocode below into the code we need. It is worth noting that in the Spring 2021 version of Project 2, the old generateSuccessor function was renamed getNextState, which is clearer and more specific.

[Figure: minimax pseudocode from the lecture slides]
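The pseudocode figure did not survive the export, so here is a self-contained plain-Python sketch of the recursion it showed. The toy tree and its values are the classic textbook example, not anything from the Pacman codebase:

# Minimax on a toy two-ply game tree: leaves are utilities,
# internal nodes are lists of children. Hypothetical example values.
def minimax(node, maximizing):
    if isinstance(node, (int, float)):  # terminal: return its utility
        return node
    child_values = [minimax(child, not maximizing) for child in node]
    # the maximizer picks the best child, the minimizer the worst
    return max(child_values) if maximizing else min(child_values)

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree, maximizing=True))  # -> 3 (the max over each branch's min)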
class MinimaxAgent(MultiAgentSearchAgent):
    """
    Your minimax agent (question 2)
    """

    def getAction(self, gameState):
        """
        Returns the minimax action from the current gameState using self.depth
        and self.evaluationFunction.

        Here are some method calls that might be useful when implementing minimax.

        gameState.getLegalActions(agentIndex):
        Returns a list of legal actions for an agent
        agentIndex=0 means Pacman, ghosts are >= 1

        gameState.getNextState(agentIndex, action):
        Returns the child game state after an agent takes an action

        gameState.getNumAgents():
        Returns the total number of agents in the game

        gameState.isWin():
        Returns whether or not the game state is a winning state

        gameState.isLose():
        Returns whether or not the game state is a losing state
        """
        "*** YOUR CODE HERE ***"

        # Step 1: the 'min-value' part (ghost layers)
        def min_value(gameState, depth, agentIndex):
            # initialize v to positive infinity
            v = float('inf')
            # if the current state is terminal, evaluate it directly
            if gameState.isWin() or gameState.isLose():
                # the evaluation function returns a number; higher is better
                return self.evaluationFunction(gameState)
            # for each successor of the state, keep the minimum value
            for legalAction in gameState.getLegalActions(agentIndex):
                if agentIndex == gameState.getNumAgents() - 1:
                    # the last ghost has moved, so the turn passes back to Pacman
                    v = min(v, max_value(gameState.getNextState(agentIndex, legalAction), depth))
                else:
                    # otherwise keep searching with the next ghost
                    v = min(v, min_value(gameState.getNextState(agentIndex, legalAction), depth, agentIndex + 1))
            return v

        # Step 2: the 'max-value' part (Pacman's layer)
        def max_value(gameState, depth):
            # initialize v to negative infinity
            v = float('-inf')
            # entering a max layer means one more ply has been completed
            depth = depth + 1
            # if the depth limit is reached or the state is terminal, evaluate it
            if depth == self.depth or gameState.isLose() or gameState.isWin():
                return self.evaluationFunction(gameState)
            # for each successor of the state, keep the maximum value
            for legalAction in gameState.getLegalActions(0):
                v = max(v, min_value(gameState.getNextState(0, legalAction), depth, 1))
            return v

        legalActions = gameState.getLegalActions(0)
        Max = float('-inf')
        Result = None

        for nAction in legalActions:
            # legal actions are capitalized strings in this codebase,
            # so the comparison must use 'Stop' (lowercase "stop" never matches)
            if nAction != 'Stop':
                depth = 0
                value = min_value(gameState.getNextState(0, nAction), depth, 1)
                if value > Max:
                    Max = value
                    Result = nAction
        return Result

Run and screenshot:

python autograder.py -q q2 --no-graphics

[Screenshot: autograder output for q2]

Problem 3: Alpha-Beta Pruning

Plain minimax wastes work exploring branches that can no longer affect the final decision, so we prune them.
Alpha-beta pruning cuts those unneeded branches out of the search tree, improving efficiency without changing the answer.
My alpha-beta implementation is structured differently from the minimax one and improves on it somewhat: a dispatcher method getValue() routes each state to the right handler, which simplifies the bookkeeping. The main change is that max_value and min_value take the extra parameters alpha and beta, and compare against them when returning v.

[Figure: alpha-beta pruning pseudocode from the lecture slides]
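As with Problem 2, the pseudocode figure is missing. The sketch below runs alpha-beta on the same hypothetical toy tree and counts evaluated leaves to show the pruning; it uses the same strict v > beta / v < alpha cutoff as the agent code that follows:

# Alpha-beta on the toy tree from the minimax sketch; `seen` records
# which leaves are actually evaluated. Hypothetical example values.
def alphabeta(node, maximizing, alpha, beta, seen):
    if isinstance(node, (int, float)):
        seen.append(node)               # this leaf was not pruned
        return node
    if maximizing:
        v = float('-inf')
        for child in node:
            v = max(v, alphabeta(child, False, alpha, beta, seen))
            if v > beta:                # a min ancestor will never allow this branch
                return v
            alpha = max(alpha, v)
    else:
        v = float('inf')
        for child in node:
            v = min(v, alphabeta(child, True, alpha, beta, seen))
            if v < alpha:               # a max ancestor will never allow this branch
                return v
            beta = min(beta, v)
    return v

seen = []
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree, True, float('-inf'), float('inf'), seen))  # -> 3, same as minimax
print(len(seen))  # -> 7: only 7 of the 9 leaves are evaluated; two are pruned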
class AlphaBetaAgent(MultiAgentSearchAgent):
"""
    Your minimax agent with alpha-beta pruning (question 3)
"""
    def getAction(self, gameState):
"""
        Returns the minimax action using self.depth and self.evaluationFunction
"""
        "*** YOUR CODE HERE ***"
        alpha = float('-inf')
        beta = float('inf')
        v = float('-inf')
        bestAction = None
        for legalAction in gameState.getLegalActions(0):
            value = self.getValue(gameState.getNextState(0, legalAction),1,0,alpha,beta)
            if value is not None and value>v:
                v = value
                bestAction = legalAction
            #update new alpha
            alpha=max(alpha,v)
        return bestAction

    def getValue(self, gameState, agentIndex, depth, alpha, beta):
        legalActions = gameState.getLegalActions(agentIndex)
        if len(legalActions)==0:
            return self.evaluationFunction(gameState)
        #according to the value of agentIndex,gain the function of next state
        #to assure the next is player or ghost
        if agentIndex==0:
            #cross 1 time
            depth=depth+1
            if depth == self.depth:
                return self.evaluationFunction(gameState)
            else:
                return self.max_value(gameState, agentIndex, depth, alpha, beta)
        elif agentIndex>0:
            return self.min_value(gameState, agentIndex, depth, alpha, beta)

    def max_value(self, gameState, agentIndex, depth, alpha, beta):
        #Initialize v= negatively infinity
        v = float('-inf')
        #for each successor of state:get max-value
        for legalAction in gameState.getLegalActions(agentIndex):
            value = self.getValue(gameState.getNextState(agentIndex, legalAction),
                (agentIndex+1)%gameState.getNumAgents(), depth, alpha, beta)
            #this condition is similar with problem2 ' s condition
            if value is not None and value > v:
                v=value
            #if v>beta return v
            if v>beta:
                return v
            #update new alpha
            alpha=max(alpha,v)
        return v

    def min_value(self, gameState, agentIndex, depth, alpha, beta):
        #Initialize v= positively infinity
        v = float('inf')
        #for each successor of state:get min-value
        for legalAction in gameState.getLegalActions(agentIndex):
            value = self.getValue(gameState.getNextState(agentIndex, legalAction),
                (agentIndex+1)%gameState.getNumAgents(), depth, alpha, beta)
            if value is not None and value < v:
                v=value
            #if v
            if v<alpha:
                return v
            #update new beta
            beta=min(beta,v)
        return v

Run and screenshot:

python autograder.py -q q3 --no-graphics

[Screenshot: autograder output for q3]

Problem 4: Expectimax

Expectimax, in essence, computes an expected value at every chance node and takes the maximum at every max node, recursing down the tree. Starting from Pacman, we iterate over every legal move and keep the action with the best utility as bestAction.
Compared with alpha-beta, the new piece is exp_value, which computes the expectation: it loops over the legal actions, visits each next state, and returns the average of their values, since the ghosts are modeled as choosing uniformly at random.
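A quick arithmetic check of the chance-node rule, with hypothetical child values rather than anything from the project: if a ghost picks uniformly among three moves, the node is worth the mean of its children, not their minimum:

# A chance node averages its children under a uniform random ghost.
values = [3, 12, 6]               # hypothetical child values
print(sum(values) / len(values))  # -> 7.0 (expectimax), vs min(values) == 3 (minimax)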

class ExpectimaxAgent(MultiAgentSearchAgent):
"""
      Your expectimax agent (question 4)
"""
    def getAction(self, gameState):
"""
        Returns the expectimax action using self.depth and self.evaluationFunction

        All ghosts should be modeled as choosing uniformly at random from their
        legal moves.

"""
        "*** YOUR CODE HERE ***"
        maxVal = float('-inf')
        bestAction = None
        for action in gameState.getLegalActions(agentIndex=0):
            value = self.getValue(gameState.getNextState(agentIndex=0, action=action), agentIndex=1, depth=0)
            if value is not None and value>maxVal:
                maxVal = value
                bestAction = action
        return bestAction

    def getValue(self, gameState, agentIndex, depth):
        legalActions = gameState.getLegalActions(agentIndex)
        if len(legalActions)==0:
            return self.evaluationFunction(gameState)
        if agentIndex==0:
            depth += 1
            if depth == self.depth:
                return self.evaluationFunction(gameState)
            else:
                return self.max_value(gameState, agentIndex, depth)
        elif agentIndex>0:
            return self.exp_value(gameState, agentIndex, depth)

    def max_value(self, gameState, agentIndex, depth):
        maxVal = -float('inf')
        legalActions = gameState.getLegalActions(agentIndex)
        for action in legalActions:
            value = self.getValue(gameState.getNextState(agentIndex, action), (agentIndex+1)%gameState.getNumAgents(), depth)
            if value is not None and value > maxVal:
                maxVal = value
        return maxVal

    def exp_value(self, gameState, agentIndex, depth):
        legalActions = gameState.getLegalActions(agentIndex)
        total = 0
        for action in legalActions:
            value = self.getValue(gameState.getNextState(agentIndex, action), (agentIndex+1)%gameState.getNumAgents(), depth)
            if value is not None:
                total += value
        return total/(len(legalActions))

Run and screenshot:

python autograder.py -q q4

[Screenshot: autograder output for q4]
The project description notes: "You should now observe a more cavalier approach in close quarters with ghosts. In particular, if Pacman perceives that he could be trapped but might escape to grab a few more pieces of food, he'll at least try. Investigate the results of these two scenarios:"
python pacman.py -p AlphaBetaAgent -l trappedClassic -a depth=3 -q -n 10

[Screenshot: AlphaBetaAgent results on trappedClassic]
python pacman.py -p ExpectimaxAgent -l trappedClassic -a depth=3 -q -n 10

[Screenshot: ExpectimaxAgent results on trappedClassic]
As expected, the ExpectimaxAgent wins about half the time, while the AlphaBetaAgent always loses. The difference is that minimax assumes perfect adversaries: Pacman concludes the trap is inescapable and simply minimizes the damage, whereas expectimax accounts for the chance that the random ghosts blunder, so the escape attempt has positive expected value.

Problem 5: Evaluation Function

This problem asks us to improve on the Reflex Agent's evaluation code. Note in particular that the function's parameters have changed: we now score the current state only, with no access to successor states. The value totalScaredTimes measures how long the ghosts will remain edible; since eating a power pellet on the map produces this positive feedback, Pacman is encouraged to eat nearby pellets. Finally, we add all the computed heuristic terms to the game score and return the sum.
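The food and ghost terms use reciprocals of Manhattan distance, so their influence grows sharply as things get close. A toy illustration with hypothetical distances, not game data:

# Reciprocal-distance features: closer food helps more, closer ghosts hurt more.
for d in (1, 3, 9):           # hypothetical Manhattan distances
    print(d, 9 / d, -10 / d)  # food bonus 9/d, ghost penalty -10/d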

def betterEvaluationFunction(currentGameState):
"""
    Your extreme ghost-hunting, pellet-nabbing, food-gobbling, unstoppable
    evaluation function (question 5).

    DESCRIPTION: <write something here so we know what you did>
"""
    "*** YOUR CODE HERE ***"
    # we can only observe the current state; successor states are unavailable here
    # gather the information we can use
    Pos = currentGameState.getPacmanPosition()       # Pacman's current position
    Food = currentGameState.getFood()                # grid of remaining food
    GhostStates = currentGameState.getGhostStates()  # state of each ghost
    ScaredTimes = [ghostState.scaredTimer for ghostState in GhostStates]
    # food term: positive feedback that grows as the nearest food gets closer
    if len(Food.asList()) > 0:
        nearestFood = min([manhattanDistance(Pos, food) for food in Food.asList()])
        foodScore = 9 / nearestFood
    else:
        foodScore = 0
    # ghost term: negative feedback that deepens as the nearest ghost gets closer
    nearestGhost = min([manhattanDistance(Pos, ghostState.getPosition()) for ghostState in GhostStates])
    dangerScore = -10 / nearestGhost if nearestGhost != 0 else 0
    # total time the ghosts will remain scared (edible)
    totalScaredTimes = sum(ScaredTimes)
    # return the game score plus all heuristic terms
    return currentGameState.getScore() + foodScore + dangerScore + totalScaredTimes

# Abbreviation
better = betterEvaluationFunction

Run and screenshot:

python autograder.py -q q5 --no-graphics

[Screenshot: autograder output for q5]

Takeaways

In this project, building on my understanding of others' code and the theory covered in class, I was able to complete and implement the game, and in the process consolidate my grasp of adversarial search.


Original: https://blog.csdn.net/weixin_45942927/article/details/120315999
Author: 夹小汁
