Article: 2022, Vol. 40, Issue 4: 944-952
Cite this article:
WANG Jun, WANG Peng, LI Xiaoyan, WANG Liang, SUN Mengyu, GAO Hui. JDE multi-object tracking algorithm integrating multi-level semantic enhancement[J]. Journal of Northwestern Polytechnical University, 2022, 40(4): 944-952

JDE multi-object tracking algorithm integrating multi-level semantic enhancement
WANG Jun1, WANG Peng2, LI Xiaoyan1, WANG Liang3, SUN Mengyu1, GAO Hui1
1. School of Electronics and Information Engineering, Xi'an Technological University, Xi'an 710021, Shaanxi, China;
2. Development Planning Office, Xi'an Technological University, Xi'an 710021, Shaanxi, China;
3. Shaanxi Academy of Aerospace Technology Application Co., Ltd., Xi'an 710100, Shaanxi, China
Abstract:
To address target ID switching in the joint detection and embedding (JDE) algorithm, which is caused by target occlusion and by insufficient extraction of ID and location information, this paper proposes a JDE multi-object tracking method that integrates multi-level semantic enhancement. First, the SPA (spatial pyramid attention) module enlarges the receptive field and captures richer semantic information, improving the model's detection accuracy for targets at different scales. Second, an FCN network lets the detection head and the ID embedding task learn collaboratively, which alleviates the excessive competition between them, enhances the original semantic information, and effectively reduces the number of ID switches. Finally, the PCCs-Ma motion metric strengthens the link between the Kalman filter's prediction and its observation, improving the reliability of motion-feature similarity discrimination. To verify the algorithm, JDE and the proposed algorithm are compared in the same experimental environment. The results show that the model's detection average precision improves by 3.94%. On the MOT16 dataset, MOTA and IDF1 both increase by 6.9%, the number of ID switches drops markedly, and good tracking performance is achieved.
Key words:    multi-object tracking    JDE algorithm    semantic information    SPA    receptive field
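For readers unfamiliar with the two tracking metrics quoted in the abstract, the sketch below computes MOTA and IDF1 from their standard definitions (CLEAR MOT accuracy and the identity F1 score). The function names and all counts are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt: int) -> float:
    """MOTA = 1 - (FN + FP + IDSW) / GT, with counts summed over all frames."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """IDF1 = 2*IDTP / (2*IDTP + IDFP + IDFN)."""
    denom = 2 * idtp + idfp + idfn
    if denom == 0:
        raise ValueError("at least one count must be non-zero")
    return 2 * idtp / denom


if __name__ == "__main__":
    # Hypothetical counts: same detections, fewer ID switches in the second run.
    baseline = mota(false_negatives=300, false_positives=100,
                    id_switches=50, num_gt=1000)
    improved = mota(false_negatives=300, false_positives=100,
                    id_switches=10, num_gt=1000)
    print(f"baseline MOTA = {baseline:.3f}, improved MOTA = {improved:.3f}")
    print(f"IDF1 = {idf1(idtp=800, idfp=100, idfn=300):.3f}")
```

Because ID switches appear in the MOTA numerator and identity mismatches lower IDF1, reducing ID switches tends to raise both metrics.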
Received: 2021-11-03     Revised:
DOI: 10.1051/jnwpu/20224040944
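Funding: Supported by the National Natural Science Foundation of China (62171360), the Key Research and Development Program of the Shaanxi Provincial Department of Science and Technology (2022GY-110), and the Xi'an Technological University President's Fund General Cultivation Project (XGPY200217)
Corresponding author: WANG Peng (1978-), professor at Xi'an Technological University; research interests include machine vision, pattern recognition, and image processing. E-mail: wang_peng@xatu.edu.cn
About the author: WANG Jun (1997-), master's student at Xi'an Technological University; research interest is visual object tracking.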
