Robotic Peg-in-Hole Assembly Control and Simulation Based on the PPO Algorithm
SHEN Yuxin, LIU Xiaoming, XIAO Yi, YU Deping*
(School of Mechanical Engineering, Sichuan University, Chengdu 610065, Sichuan, China)
Abstract: A control method based on the PPO (proximal policy optimization) algorithm is proposed for large-diameter peg-in-hole assembly, a task common in pipeline transportation and aerospace. First, a framework is established in which the reinforcement learning algorithm trains through interaction with the assembly environment, and two networks are designed, one to fit the assembly policy and one to estimate the value function. Second, the action space output by the robot and the state space output by the assembly environment are designed to ensure effective exploration during learning. Third, a nonlinear reward function is designed to ensure fast and stable convergence of training. Finally, a simulation platform for robotic large-diameter peg-in-hole assembly is built on the MuJoCo physics engine, and the proposed algorithm is trained and tested on it. The results show that the PPO-based training framework ensures fast convergence of training, and that the improved advantage function estimation method makes training more stable. The trained model not only inserts the peg into the hole and brings the flange faces into contact, but also keeps the assembly process safe.
Key words: assembly; PPO algorithm; MuJoCo simulation
CLC number: TP249            Document code: A            doi: 10.3969/j.issn.1006-0316.2023.12.012
Article ID: 1006-0316 (2023) 12-0074-07
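The abstract names PPO and an advantage function estimate but gives no formulas. As a rough illustration only, the sketch below shows the textbook clipped PPO surrogate loss and standard generalized advantage estimation (GAE). It is not the improved estimator the paper reports, and the function names, parameter names, and default values (`gamma`, `lam`, `clip_eps`) are assumptions introduced here.

```python
import math


def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Standard GAE over one rollout segment (assumed formulation, not the paper's).

    rewards[t] and values[t] belong to step t; last_value is the critic's
    estimate for the state reached after the final step (0.0 if terminal).
    """
    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step temporal-difference error at step t.
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of the TD errors from step t onward.
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages


def ppo_clip_loss(new_logps, old_logps, advantages, clip_eps=0.2):
    """Textbook PPO clipped surrogate, returned as a loss to minimize."""
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the updated policy and the sampling policy.
        ratio = math.exp(new_lp - old_lp)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Taking the smaller term removes any gain from moving the ratio
        # beyond the clip range in the direction the advantage favors.
        total += min(ratio * adv, clipped * adv)
    return -total / len(advantages)


if __name__ == "__main__":
    adv = gae_advantages([1.0, 1.0], [0.0, 0.0], 0.0, gamma=1.0, lam=1.0)
    print(adv)
    print(ppo_clip_loss([0.0, 0.0], [0.0, 0.0], adv))
```

In an actor-critic setup such as the two-network design the abstract describes, the critic would supply `values` and `last_value`, and the policy network would supply the log-probabilities; how the paper modifies the advantage estimate is not stated in the abstract.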
———————————————
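Received: 2023-07-16
About the authors: SHEN Yuxin (b. 1998), male, from Suining, Sichuan, master's student; main research interest: robot automation; E-mail: shenyuxin2021@163.com. *Corresponding author: YU Deping (b. 1984), male, from Fuzhou, Jiangxi, Ph.D., professor; main research interest: intelligent and automated equipment; E-mail: williamydp@scu.edu.cn.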



 
