Article: 2021, Vol. 39, Issue 3: 510-520
Cite this article:
WANG Shiyu, ZHANG Shengbing, HUANG Xiaoping, CHANG Libo. Single-chip multi-processing architecture for spaceborne SAR imaging and intelligent processing[J]. Journal of Northwestern Polytechnical University

Single-chip multi-processing architecture for spaceborne SAR imaging and intelligent processing
WANG Shiyu, ZHANG Shengbing, HUANG Xiaoping, CHANG Libo
School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an 710072, China
Abstract:
A spaceborne SAR (synthetic aperture radar) intelligent image processing system must perform imaging and a variety of application tasks on orbit in real time. A dedicated, efficient single-chip multi-processor can meet the real-time and low-power requirements, and on-chip data organization and the memory access structure are the focus of the design. Two typical models, the CSA (chirp scaling) algorithm for SAR imaging and the VGG-11 neural network, are analyzed, and a collaborative computing model for intelligent remote sensing image processing is abstracted from them. A strip-based Tile data processing scheme and a dedicated multi-processing architecture are designed, together with a strategy for Tile partitioning and synchronized stitching of multiple Tiles. A data caching structure between the processing units greatly reduces the off-chip memory access bandwidth and supports parallel pipelined execution of multi-task models. The chip is implemented in a 28 nm CMOS process; its overall power consumption is 1.83 W, and its throughput and energy efficiency reach 9.89 TOPS and 5.4 TOPS/W, respectively. The architecture improves the real-time performance of on-orbit intelligent remote sensing processing platforms, reduces system design complexity, and can be extended flexibly to suit different algorithm models.
Key words: chip multi-processors; domain-specific; intelligent remote sensing; strip tiling; data filling; flexible extensibility
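The strip-wise tiling idea named in the abstract can be illustrated with a minimal software sketch. This is not the paper's hardware scheme: the 3x3 kernel, the one-row halo, the zero fill at the image border, and the function names (`conv3x3_strips`, `conv3x3_full`, `pad_columns`, `conv3x3_valid`) are all assumptions chosen for illustration. The sketch only shows that an image can be processed one horizontal strip at a time, with filled boundary rows, and stitched back into the same result as whole-image processing.

```python
def conv3x3_valid(padded, kernel):
    """3x3 sliding-window sum over an already padded 2-D list ('valid' output)."""
    rows, cols = len(padded) - 2, len(padded[0]) - 2
    return [
        [
            sum(kernel[i][j] * padded[r + i][c + j] for i in range(3) for j in range(3))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def pad_columns(rows):
    """Add one zero column on the left and right of every row."""
    return [[0] + list(row) + [0] for row in rows]


def conv3x3_full(image, kernel):
    """Reference: zero-padded 'same' result computed over the whole image at once."""
    zero_row = [0] * len(image[0])
    return conv3x3_valid(pad_columns([zero_row] + image + [zero_row]), kernel)


def conv3x3_strips(image, kernel, strip_rows):
    """Process the image strip by strip and stitch the partial outputs (assumed scheme)."""
    height, width = len(image), len(image[0])
    zero_row = [0] * width
    output = []
    for top in range(0, height, strip_rows):
        bottom = min(top + strip_rows, height)
        # Halo rows: the neighbouring row above and below, or zero fill at the image border.
        above = image[top - 1] if top > 0 else zero_row
        below = image[bottom] if bottom < height else zero_row
        strip = [above] + image[top:bottom] + [below]
        # Each strip yields exactly (bottom - top) output rows, so stitching is a simple append.
        output.extend(conv3x3_valid(pad_columns(strip), kernel))
    return output


# Small synthetic 10x6 image and a Sobel-like kernel, both made up for the demo.
image = [[(3 * r + 5 * c) % 7 for c in range(6)] for r in range(10)]
kernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
tiled = conv3x3_strips(image, kernel, strip_rows=4)
assert tiled == conv3x3_full(image, kernel)
```

In this sketch only one strip plus two halo rows needs to be resident at a time, which is the kind of working-set reduction that strip tiling aims for; how the actual chip buffers and synchronizes Tiles is described in the paper itself, not here.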
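Received: 2020-09-28
DOI: 10.1051/jnwpu/20213930510
About the author: WANG Shiyu (b. 1984), PhD candidate at Northwestern Polytechnical University, working mainly on high-performance microprocessor architecture and VLSI/SoC design and test.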
