3D point cloud object detection algorithm based on Transformer -- Journal of Northwestern Polytechnical University, 2023, 41(6): 1190-1197

	Article: 2023, Vol. 41, Issue 6: 1190-1197
	Cite this article:
	LIU Mingyang, YANG Qiming, HU Guanhua, GUO Yan, ZHANG Jiandong. 3D point cloud object detection algorithm based on Transformer[J]. Journal of Northwestern Polytechnical University, 2023, 41(6): 1190-1197

3D point cloud object detection algorithm based on Transformer

LIU Mingyang (刘明阳)¹, YANG Qiming (杨啟明)², HU Guanhua (胡冠华)²,³, GUO Yan (郭岩)⁴, ZHANG Jiandong (张建东)²

1. Shenyang Aircraft Design Research Institute, Shenyang 110035, China;
2. School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710072, China;
3. CSSC Systems Engineering Research Institute, Beijing 100094, China;
4. No. 1 Military Representative Office of the Equipment Department of the PLA Air Force in Shenyang, Shenyang 110850, China

Abstract:

Anchor-based methods are difficult to deploy in 3D object detection because of the added spatial dimension. To address this problem, this paper studies point cloud object detection based on set prediction and proposes a Transformer-based 3D point cloud object detection algorithm. Drawing on the characteristics of point clouds in autonomous driving scenes, it introduces an improved spatial modulation attention mechanism to accelerate training and a heat map initialization strategy to initialize the queries, achieving good detection performance with a shallow network. Comparisons with other algorithms on the KITTI dataset show that the proposed algorithm reaches an advanced level of performance. Ablation experiments on the main components of the algorithm further verify the contribution of each module to detection performance.

Key words: Transformer; spatial modulation attention mechanism; heat map initialization; object detection; deep learning
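
The abstract names a heat map initialization strategy for the queries but does not spell out the procedure. As a rough illustration only, the sketch below shows one common way such a strategy can work: treat a per-class bird's-eye-view heat map as input, keep the cells that are local maxima in their 3x3 neighbourhood, and use the strongest peaks as the initial query positions. The function name `init_queries_from_heatmap`, the nested-list heat map layout (class, row, column) and the tuple layout of a query are all assumptions made for this example and are not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical query record: (score, class_id, row, col) on the heat map grid.
Query = Tuple[float, int, int, int]


def init_queries_from_heatmap(heatmap: List[List[List[float]]],
                              num_queries: int) -> List[Query]:
    """Select up to num_queries local peaks of a (class, row, col) heat map."""
    peaks: List[Query] = []
    for class_id, plane in enumerate(heatmap):
        height, width = len(plane), len(plane[0])
        for row in range(height):
            for col in range(width):
                score = plane[row][col]
                # Gather the 3x3 neighbourhood, clipped at the grid borders.
                neighbours = [
                    plane[r][c]
                    for r in range(max(0, row - 1), min(height, row + 2))
                    for c in range(max(0, col - 1), min(width, col + 2))
                ]
                # Keep a cell only if it is positive and a local maximum.
                if score > 0.0 and score >= max(neighbours):
                    peaks.append((score, class_id, row, col))
    # Strongest peaks first; Python's sort is stable, so ties keep scan order.
    peaks.sort(key=lambda q: -q[0])
    return peaks[:num_queries]
```

In a detector of this kind, the selected grid positions would typically be used to gather features and positional encodings for the decoder queries, so that the decoder starts from likely object centres rather than from randomly initialized queries. That step is omitted here because the abstract gives no detail about it.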

Received: 2023-01-09 Revised:

DOI: 10.1051/jnwpu/20234161190

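Funding: Supported by the Natural Science Basic Research Program of Shaanxi Province (2022JQ-593) and the Key Research and Development Program of Shaanxi Province (2022GY-089)

Corresponding author: YANG Qiming (杨啟明, born 1988), assistant researcher at Northwestern Polytechnical University, working mainly on artificial intelligence and autonomous decision-making and control. E-mail: yangqm@nwpu.edu.cn

About the author: LIU Mingyang (刘明阳, born 1986), senior engineer at Shenyang Aircraft Design Research Institute, working mainly on avionics system design.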



