The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents. Unity officially provides PyTorch-based implementations of reinforcement learning algorithms, making it easy for game developers and hobbyists to train intelligent agents for 2D, 3D, and VR/AR games. Researchers can also use the simple, easy-to-use Python API to train agents with reinforcement learning, imitation learning, neuroevolution, or any other method.
This article follows the official Getting Started documentation to cover environment setup and API usage, supplemented with some code from my own experiments.
Environment Setup
Installation is the document to actually follow; I recommend setting up the environment with the minimal installation it describes. The end of that document has a Next Steps section showing how to run an official pretrained model. Since our focus is on using the Python API to develop our own network models, you can use that walkthrough to verify the environment works, but I will not go deeper into mlagents-learn (it is fairly restrictive and suited to developers who need to build a model quickly, not to researchers or users who want to use their own models).
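As a rough setup sketch (package names follow the official Installation guide; the exact supported Python version range changes between releases, so check the guide's compatibility notes rather than trusting the version pinned here):

```shell
# Work inside a fresh virtual environment to avoid dependency conflicts.
python -m venv mlagents-env
source mlagents-env/bin/activate

# Installing mlagents also pulls in mlagents_envs, which provides
# UnityEnvironment, ActionTuple, and the side-channel classes used below.
pip install mlagents
```

If you only need the low-level Python API (as in this article) and not the mlagents-learn trainers, installing just `mlagents_envs` is enough.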
Project Import
Simply opening the project (the Project folder in the repository root) with the Unity Editor works, although the sample-package import procedure from the official tutorial is preferable.
Writing the Python Script
The documentation is fairly detailed; here is my demo code:
"""
file_name is the name of the environment binary (located in the root directory of the python project).
worker_id indicates which port to use for communication with the environment.
For use in parallel training regimes such as A3C.
seed indicates the seed to use when generating random numbers during the training process.
In environments which are deterministic, setting the seed enables reproducible experimentation by ensuring
that the environment and trainers utilize the same random seed.
side_channels provides a way to exchange data with the Unity simulation
that is not related to the reinforcementlearning loop.
For example: configurations or properties.More on them in the Modifying the environment from Python section.
If you want to directly interact with the Editor, you need to use file_name=None,
then press the Play button in the Editor when the message
"Start training by pressing the Play button in the Unity Editor" is displayed on the screen
"""
env = UnityEnvironment(file_name=None, seed=1, side_channels=[UnityStaticLogChannel()])
config_channel.set_configuration_parameters(time_scale=1.0)
env.reset()
"""
Returns a Mapping of BehaviorName to BehaviorSpec objects (read only).
A BehaviorSpec contains the observation shapes and the ActionSpec (which defines the action shape).
Note that the BehaviorSpec for a specific group is fixed throughout the simulation.
The number of entries in the Mapping can change over time in the simulation if new Agent behaviors
are created in the simulation.
An Agent "Behavior" is a group of Agents identified by a BehaviorName that share the same observations
and action types (described in their BehaviorSpec).
"""
behavior_names = env.behavior_specs.keys()
for i in behavior_names:
print("[Info] Behavior Name: ", i.title())
count = 0
while True:
if count > 5000:
break
for name in behavior_names:
states = env.get_steps(name)
actions = ActionTuple()
ac = np.random.randint(0, 5, size=16).reshape(-1, 1)
actions.add_discrete(ac)
env.set_actions(name, actions)
env.step()
count += 1
env.close()
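The random-action line in the demo is easy to sanity-check in isolation: add_discrete expects an integer array of shape (n_agents, n_discrete_branches), so for 16 agents and a single discrete branch the array must be (16, 1). The agent count of 16 and the 0–4 action range are assumptions from my scene; adjust them for yours.

```python
import numpy as np

# One random action in [0, 5) per agent, reshaped to one column per
# discrete action branch: (n_agents, n_branches) == (16, 1).
ac = np.random.randint(0, 5, size=16).reshape(-1, 1)

print(ac.shape)  # (16, 1)
```

If your behavior has several discrete branches, sample one column per branch and stack them so the second dimension matches the branch count in the BehaviorSpec.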
UnityStaticLogChannel
is the class used for side-channel communication between the Unity frontend and the Python algorithm side; the documentation is at Custom-SideChannels. Here is an example:
import uuid

from mlagents_envs.side_channel.side_channel import (
    IncomingMessage,
    OutgoingMessage,
    SideChannel,
)


class UnityStaticLogChannel(SideChannel):
    def __init__(self) -> None:
        # The UUID must match the one registered on the Unity (C#) side.
        super().__init__(uuid.UUID("a1d8f7b7-cec8-50f9-b78b-d3e165a78520"))

    def on_message_received(self, msg: IncomingMessage) -> None:
        """
        Note: We must implement this method of the SideChannel interface to
        receive messages from Unity.
        """
        print(msg.read_string())

    def send_string(self, data: str) -> None:
        msg = OutgoingMessage()
        msg.write_string(data)
        super().queue_message_to_send(msg)
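Under the hood, OutgoingMessage.write_string uses a simple length-prefixed encoding; to the best of my understanding this is a little-endian 32-bit byte count followed by the UTF-8 payload, but treat the exact layout as an assumption and check the mlagents_envs source for your version. A stdlib-only sketch of that framing, useful for reasoning about what the C# side has to parse:

```python
import struct


def encode_string(data: str) -> bytes:
    """Length-prefixed string framing (assumed side-channel layout):
    a little-endian int32 byte count, then the UTF-8 payload."""
    payload = data.encode("utf-8")
    return struct.pack("<i", len(payload)) + payload


def decode_string(buf: bytes) -> str:
    """Inverse of encode_string: read the int32 length, then decode
    that many UTF-8 bytes."""
    (length,) = struct.unpack_from("<i", buf, 0)
    return buf[4:4 + length].decode("utf-8")


msg = encode_string("hello from python")
print(decode_string(msg))  # hello from python
```

This round-trip is only an illustration of the framing idea; in real code you should always go through IncomingMessage/OutgoingMessage rather than hand-rolling the bytes.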
Running it is simple: start the Python script first, then press Play on the corresponding game scene in the Editor.
Original: https://blog.csdn.net/vic_torsun/article/details/119849763
Author: 丧心病狂の程序员
Title: [Unity and Reinforcement Learning] ML-Agents Python API Environment Setup and Development