Real-time Detection of Surface Cracks in Steel Beam using Lightweight Convolutional Neural Network
-
Abstract: Steel beams are widely used in engineering projects, and surface cracks that are not found in time can become safety hazards. This paper uses a lightweight convolutional neural network with a cross-stage hierarchical structure to detect surface cracks in steel beams rapidly and in real time. First, a backbone network for feature extraction is built from a cross-stage partial (CSP) network, which enriches the gradient update paths and helps extract shallow crack features. Second, the cross-stage hierarchical module is embedded as a feature extractor in one branch of the cross-stage hierarchical structure, giving a lightweight feature extraction module that greatly improves detection speed. Finally, multi-scale feature fusion is combined with YOLO layers to complete the object detection task. Experiments show that the lightweight network reaches a highest mAP of 93.59% at a frame rate of 30.3 frames per second. With only a small gap in detection performance, its detection speed is about 4 times that of YOLOv3 and 4.5 times that of YOLOv4.
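To make the speed argument concrete, the following sketch counts multiply-accumulate operations for a 3×3 convolution applied to every channel of a feature map versus only half of them, which is the general idea behind sending a single cross-stage branch through the feature extractor. The even channel split, the 104×104×64 feature map, and the helper name `conv_macs` are assumptions chosen for illustration; the exact block layout used by the authors is not specified here.

```python
def conv_macs(height, width, kernel, c_in, c_out):
    """Multiply-accumulate count of a stride-1, same-padded convolution."""
    return height * width * kernel * kernel * c_in * c_out


# Hypothetical feature map: 104 x 104 spatial size with 64 channels.
H, W, C = 104, 104, 64

# Plain block: the 3x3 convolution sees every channel.
full_cost = conv_macs(H, W, 3, C, C)

# Cross-stage style block (assumed even split): only half of the channels
# pass through the convolution; the other half is carried forward
# unchanged and concatenated afterwards, which needs no multiplications.
half_cost = conv_macs(H, W, 3, C // 2, C // 2)

print(f"full-channel conv : {full_cost / 1e6:.1f} MMACs")
print(f"half-channel conv : {half_cost / 1e6:.1f} MMACs")
print(f"cost ratio        : {full_cost / half_cost:.1f}x")
```

Because the cost scales with the product of input and output channels, halving both cuts this one convolution to a quarter of its original cost. A real block adds further layers, so the overall saving will be smaller than this single-layer figure.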
-
Key words:
- flaw detection
- deep learning
- lightweight network
-
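Table 1 Structure of the lightweight convolutional neural network

| Layer | Functional layer | Kernel | Filters | Input shape | Output shape |
|---|---|---|---|---|---|
| 0 | Conv | 3×3/2 | 32 | 416×416×3 | 208×208×32 |
| 1 | Conv | 3×3/2 | 64 | 208×208×32 | 104×104×64 |
| 2 | CSP_Tiny1 | − | − | 104×104×64 | 104×104×128 |
| 3 | MaxPool | 3×3/2 | 128 | 104×104×128 | 52×52×128 |
| 4 | CSP_Tiny2 | | | 52×52×128 | 52×52×256 |
| 5 | MaxPool | 3×3/2 | 256 | 52×52×256 | 26×26×256 |
| 6 | CSP_Tiny3 | | | 52×52×256 | 26×26×512 |
| 7 | MaxPool | 3×3/2 | 512 | 26×26×512 | 13×13×512 |
| 8 | Conv | 3×3/1 | 512 | 26×26×512 | 13×13×512 |
| 9 | Conv | 1×1/1 | 256 | 13×13×512 | 13×13×256 |
| 10 | Conv | 3×3/1 | 512 | 13×13×256 | 13×13×512 |
| 11 | Predict_1 | − | − | − | − |
| 12 | Route | | | Layer_9 | 13×13×256 |
| 13 | Conv | 1×1/1 | 128 | 13×13×256 | 13×13×128 |
| 14 | Upsample | − | − | 13×13×128 | 26×26×128 |
| 15 | Concat | | | Layer_14 ⊕ Layer_6 | 26×26×384 |
| 16 | Conv | 3×3/1 | 256 | 26×26×384 | 26×26×256 |
| 17 | Predict_2 | − | − | − | − |

Table 2 Experimental results of the four models

| Models | Max Recall | Max Precision | TP | FP | FN | mAP50 | GIoU | FPS | BFLOPS |
|---|---|---|---|---|---|---|---|---|---|
| YOLOv3 | 96% | 94% | 886 | 77 | 49 | 97.20% | 74.98% | 7.6 | 65.304 |
| YOLOv3-Tiny | 84% | 94% | 780 | 66 | 155 | 92.20% | 72.02% | 27.7 | 7.099 |
| CSP-DenseNet | 97% | 90% | 908 | 125 | 27 | 97.53% | 73.30% | 6.7 | 59.563 |
| CSP-Tiny | 89% | 93% | 829 | 73 | 106 | 93.59% | 74.81% | 30.3 | 6.787 |

-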
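The counts in Table 2 can be related to precision, recall, and relative speed with a few lines of arithmetic. The sketch below recomputes these quantities from the reported TP, FP, FN, and frame-rate values. Precision and recall obtained this way describe the single operating point behind the counts, so they need not match the "Max Recall" and "Max Precision" columns. The dictionary layout and function names are illustrative and not taken from the paper.

```python
# (TP, FP, FN, frames per second) copied from Table 2.
RESULTS = {
    "YOLOv3": (886, 77, 49, 7.6),
    "YOLOv3-Tiny": (780, 66, 155, 27.7),
    "CSP-DenseNet": (908, 125, 27, 6.7),
    "CSP-Tiny": (829, 73, 106, 30.3),
}


def precision(tp, fp):
    """Fraction of predicted boxes that are correct."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Fraction of ground-truth objects that were found."""
    return tp / (tp + fn)


def speedup(model, baseline):
    """Frame-rate ratio of `model` over `baseline`."""
    return RESULTS[model][3] / RESULTS[baseline][3]


for name, (tp, fp, fn, fps) in RESULTS.items():
    print(
        f"{name:<13} precision={precision(tp, fp):.3f} "
        f"recall={recall(tp, fn):.3f} "
        f"CSP-Tiny speed ratio={speedup('CSP-Tiny', name):.2f}"
    )
```

With the table's values, CSP-Tiny runs at roughly 4.0 times the frame rate of YOLOv3 and 4.5 times that of CSP-DenseNet, and every row accounts for the same 935 ground-truth objects (TP + FN).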