Article: 2024, Vol. 42, Issue 2: 344-352
Cite this article:
JIANG Dongnian, WANG Renjie. Research on the generation method of missing data for soft measurement based on GAN[J]. Journal of Northwestern Polytechnical University

Research on the generation method of missing data for soft measurement based on GAN
JIANG Dongnian, WANG Renjie
College of Electrical and Information Engineering, Lanzhou University of Technology, Lanzhou 730050, China
Abstract:
To address the low accuracy of soft sensor models caused by missing sensor data in industrial processes, a method for generating missing sensor data based on generative adversarial nets (GAN) is proposed. First, the missing regions of the sensor data are detected with the isolation forest algorithm. Second, a conditional generative adversarial net (CGAN) is trained on the attribute features of the missing data; random sequences are added to the input conditions of the CGAN as additional information and fed into it iteratively to generate data, and the Wasserstein GAN with gradient penalty (WGAN-GP) cost function is used to improve the stability of network training. Finally, a sampler is introduced based on the missing-region detection results, and the sampled data are filled into the missing regions to form a complete data set, thereby improving the accuracy of the soft sensor model. Temperature sensor data from a nickel flash furnace are used as the target variable for soft sensor modelling to verify the feasibility and effectiveness of the proposed method.
Key words:    missing data    isolation forest    generative adversarial nets (GAN)    soft sensor model
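The detect-then-fill pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a one-dimensional synthetic temperature series in which dropouts are recorded as 0.0, uses a simplified from-scratch isolation forest for 1-D data, and replaces the trained CGAN with a Gaussian sampler fitted to the unflagged readings as a stand-in. The names `anomaly_scores`, `fill_missing`, and `THRESHOLD`, as well as the threshold value 0.7, are hypothetical choices made for this example.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(values, depth, max_depth, rng):
    """Isolate 1-D points by recursive random splits; leaves record their size."""
    if len(values) <= 1 or depth >= max_depth:
        return {"size": len(values)}
    lo, hi = min(values), max(values)
    if lo == hi:  # identical values cannot be separated further
        return {"size": len(values)}
    split = rng.uniform(lo, hi)
    return {
        "split": split,
        "left": build_tree([v for v in values if v < split], depth + 1, max_depth, rng),
        "right": build_tree([v for v in values if v >= split], depth + 1, max_depth, rng),
    }


def path_length(node, x, depth=0):
    """Depth at which x lands in a leaf, plus a correction for the leaf size."""
    if "split" not in node:
        return depth + c_factor(node["size"])
    child = node["left"] if x < node["split"] else node["right"]
    return path_length(child, x, depth + 1)


def anomaly_scores(values, n_trees=100, sample_size=64, seed=0):
    """Isolation-forest score in (0, 1]; higher means easier to isolate.

    Assumes at least two readings, so the normalising constant is non-zero.
    """
    rng = random.Random(seed)
    sample_size = min(sample_size, len(values))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [build_tree(rng.sample(values, sample_size), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(sample_size)
    return [2.0 ** (-sum(path_length(t, v) for t in trees) / (n_trees * norm))
            for v in values]


def fill_missing(values, flagged, sampler):
    """Return a copy of values with each flagged index replaced by a drawn sample."""
    filled = list(values)
    for i in flagged:
        filled[i] = sampler()
    return filled


# Synthetic readings (assumption): a slow oscillation around 1300 with a
# five-sample dropout recorded as 0.0.
readings = [1300.0 + 5.0 * math.sin(i / 5.0) for i in range(105)]
for i in range(40, 45):
    readings[i] = 0.0

THRESHOLD = 0.7  # hypothetical cut-off on the anomaly score
scores = anomaly_scores(readings)
flagged = [i for i, s in enumerate(scores) if s > THRESHOLD]

# Stand-in generator (assumption): a Gaussian fitted to the unflagged readings.
# In the proposed method this role is played by the trained CGAN.
valid = [v for i, v in enumerate(readings) if i not in flagged]
gen_rng = random.Random(1)
mean, std = statistics.mean(valid), statistics.pstdev(valid)
completed = fill_missing(readings, flagged, lambda: gen_rng.gauss(mean, std))
```

In this sketch only the flagged positions are overwritten, so valid measurements pass through unchanged. Swapping the Gaussian sampler for draws from a conditional generator would leave the surrounding detection and filling logic the same.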
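Received: 2023-03-15
DOI: 10.1051/jnwpu/20244220344
Funding: National Natural Science Foundation of China (62263020), Key Research and Development Program of Gansu Province (23YFGA0061), Lanzhou Science and Technology Program (2022-2-69), Gansu Province Outstanding Youth Fund (20JR10RA202), the Hongliu Outstanding Young Talents Support Program of Lanzhou University of Technology, and the Longyuan Young Talents Project
Corresponding author: JIANG Dongnian (b. 1984), e-mail: dreamjdn@126.com
About the author: JIANG Dongnian (b. 1984), associate professor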
