Article: 2021, Vol. 39, Issue 3: 477-483
Cite this article:
LI Zeyu, LIU Weidong, LI Le, ZHANG Wenbo, GUO Liwei. Path following method for AUV based on Q-Learning and RBF neural network[J]. Journal of Northwestern Polytechnical University

Path following method for AUV based on Q-Learning and RBF neural network
LI Zeyu, LIU Weidong, LI Le, ZHANG Wenbo, GUO Liwei
School of Marine Science and Technology, Northwestern Polytechnical University, Xi'an 710072, China
Abstract:
During underwater docking, the speed of an AUV varies under the influence of multiple factors, and the effectiveness of its stern rudders changes accordingly, which directly degrades path-following performance along the docking path. A Q-learning-based sliding mode control (SMC) method is proposed in which reinforcement learning tunes the AUV controller autonomously according to the vehicle's navigation state, improving the heading and depth responses and hence the path-following performance. First, an AUV path-following guidance law is established, and heading and pitch (depth) sliding mode controllers are designed to track it while keeping the system robust to external disturbances. Then, Q-learning is used to optimize the SMC parameters through offline training, based on the AUV speed, the tracking error, and the rate of change of the error; an RBF neural network is built to speed up training and avoid the "curse of dimensionality". Finally, the trained RBF-Q-learning network is applied to online control and compared with a traditional sliding mode controller in numerical simulations. The simulation results verify the effectiveness of the method.
Key words:    autonomous underwater vehicle    path following    reinforcement learning    RBF neural network
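The core idea in the abstract, Q-learning whose value function is approximated by an RBF network instead of a lookup table, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the class name `RBFQ`, the normalized three-component state (speed, tracking error, error rate), the grid of RBF centers, the candidate gain set `gains`, and all numeric values are assumptions made for the example. The paper's actual network structure, reward, and action set are not given on this page.

```python
import numpy as np


def rbf_features(state, centers, width):
    """Gaussian RBF activations of a state vector against fixed centers."""
    d2 = np.sum((centers - state) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * width ** 2))


class RBFQ:
    """Q-learning with one linear weight vector per action over RBF features.

    Q(s, a) = w_a . phi(s), where phi(s) is the RBF feature vector.
    Hypothetical sketch; hyperparameters are illustrative only.
    """

    def __init__(self, centers, width, n_actions, alpha=0.1, gamma=0.9):
        self.centers = np.asarray(centers, dtype=float)
        self.width = width
        self.alpha = alpha  # learning rate
        self.gamma = gamma  # discount factor
        self.weights = np.zeros((n_actions, len(self.centers)))

    def q_values(self, state):
        phi = rbf_features(np.asarray(state, dtype=float), self.centers, self.width)
        return self.weights @ phi

    def select_action(self, state, epsilon, rng):
        # Epsilon-greedy exploration over the discrete action set.
        if rng.random() < epsilon:
            return int(rng.integers(self.weights.shape[0]))
        return int(np.argmax(self.q_values(state)))

    def update(self, state, action, reward, next_state):
        # One-step Q-learning update applied to the weights of the taken action.
        phi = rbf_features(np.asarray(state, dtype=float), self.centers, self.width)
        target = reward + self.gamma * np.max(self.q_values(next_state))
        td_error = target - self.weights[action] @ phi
        self.weights[action] += self.alpha * td_error * phi
        return td_error


# Toy usage: 3 x 3 x 3 grid of centers over a normalized state
# (speed, tracking error, error rate); each action picks a candidate SMC gain.
grid = np.linspace(-1.0, 1.0, 3)
centers = np.array([[v, e, de] for v in grid for e in grid for de in grid])
gains = [0.5, 1.0, 2.0]  # hypothetical candidate controller gains
agent = RBFQ(centers, width=0.5, n_actions=len(gains))
rng = np.random.default_rng(0)
state = np.array([0.2, -0.1, 0.0])
action = agent.select_action(state, epsilon=0.1, rng=rng)
```

Because the state is continuous, a tabular Q-function would need a cell for every discretized combination of speed, error, and error rate. The RBF representation stores only one weight per center and per action, which is how a function approximator sidesteps the curse of dimensionality mentioned above. In an offline training loop, `update` would be called once per simulation step with a reward derived from the tracking error.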
Received: 2020-09-07     Revised:
DOI: 10.1051/jnwpu/20213930477
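Funding: Supported by the National Natural Science Foundation of China (61903304) and the National Key Research and Development Program of China (2016YFC0301700)
Corresponding author: LI Le (1986-), assistant professor at Northwestern Polytechnical University; research focuses on cooperative control of underwater vehicles. e-mail: leli@nwpu.edu.cn
About the author: LI Zeyu (1992-), PhD candidate at Northwestern Polytechnical University; research focuses on docking (recovery) control methods for underwater vehicles.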
