Article: 2023, Vol. 41, Issue 6: 1190-1197
Cite this article:
LIU Mingyang, YANG Qiming, HU Guanhua, GUO Yan, ZHANG Jiandong. 3D point cloud object detection algorithm based on Transformer[J]. Journal of Northwestern Polytechnical University

3D point cloud object detection algorithm based on Transformer
LIU Mingyang1, YANG Qiming2, HU Guanhua2,3, GUO Yan4, ZHANG Jiandong2
1. Shenyang Aircraft Design Research Institute, Shenyang 110035, China;
2. School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710072, China;
3. CSSC Systems Engineering Research Institute, Beijing 100094, China;
4. No. 1 Military Representative Office of Equipment Department of PLA Airforce in Shenyang, Shenyang 110850, China
Abstract:
Anchor-based methods are difficult to deploy in 3D object detection because of the added spatial dimension. To address this problem, this paper studies point cloud object detection based on set prediction and proposes a Transformer-based 3D point cloud object detection algorithm. Drawing on the characteristics of point clouds in autonomous driving scenes, it proposes an improved spatially modulated attention mechanism and a heatmap initialization strategy for training acceleration and query initialization, and achieves good detection performance with a shallow network. A comparison with other algorithms on the KITTI dataset shows that the proposed algorithm reaches an advanced level of performance. Ablation experiments on the main components of the algorithm verify the contribution of each module to detection performance.
Key words:    Transformer    spatially modulated attention mechanism    heatmap initialization    object detection    deep learning   
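Received: 2023-01-09     Revised:
DOI: 10.1051/jnwpu/20234161190
Funding: Supported by the Natural Science Basic Research Program of Shaanxi Province (2022JQ-593) and the Key Research and Development Program of Shaanxi Province (2022GY-089)
Corresponding author: YANG Qiming (b. 1988), assistant research fellow at Northwestern Polytechnical University, works mainly on artificial intelligence and autonomous decision-making and control. E-mail: yangqm@nwpu.edu.cn
Author biography: LIU Mingyang (b. 1986), senior engineer at Shenyang Aircraft Design Research Institute, works mainly on avionics system design.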
