Article: 2015, Vol. 33, Issue 2: 332-336
Cite this article:
Zheng Wei, Wu Xiaoxue, Tan Xin, Peng Yaopeng, Yang Shuai. Software Fault Localization Using Semi-Supervised Learning[J]. Journal of Northwestern Polytechnical University, 2015, 33(2):332-336

Software Fault Localization Using Semi-Supervised Learning
Zheng Wei, Wu Xiaoxue, Tan Xin, Peng Yaopeng, Yang Shuai
Department of Software and Microelectronic Engineering, Northwestern Polytechnical University, Xi'an 710072, China
Abstract:
Fault localization is one of the most time-consuming and expensive activities in software engineering. To reduce its cost and improve its efficiency, machine learning methods are widely used in automated software fault localization. Traditional supervised learning methods, however, need a large number of labeled samples to train a well-performing classifier, and in real projects such samples are difficult and costly to obtain. To address this problem, we propose using semi-supervised learning for software fault localization at the statement level. We adopt Co-Trade, a confident co-training style of semi-supervised learning algorithm, which uses the dynamic attributes between a program's executable statements and test case executions, together with several static attributes that have proved effective in traditional software fault localization, to carry out co-training. The resulting classifier is then used to classify the remaining statements of the program and thereby identify the faulty ones. Finally, we evaluate the algorithm on the Siemens Suite and, by comparing it with traditional supervised learning algorithms, demonstrate the effectiveness of semi-supervised learning for software fault localization.
Key words: backpropagation algorithms; classification (of information); classifiers; cost reduction; decision trees; errors; fault detection; MATLAB; software engineering; software testing; support vector machines; Co-Trade; co-training algorithm; dynamic attributes between the programs' executable statements and test case execution; semi-supervised learning; software fault localization; training data
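The co-training idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and it is not Co-Trade itself (Co-Trade additionally estimates labeling confidence and edits the data it adds); it is plain two-view co-training with a nearest-centroid classifier, chosen for brevity. The function names (`centroids`, `predict`, `co_train`), the feature values, and the six toy statements are all assumptions made for illustration: view A stands for dynamic attributes (here, the fraction of failing and of passing tests that execute a statement) and view B for static attributes (two hypothetical numeric features).

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroids(samples, labels):
    """Return the mean feature vector of each class label."""
    out = {}
    for lab in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == lab]
        out[lab] = [sum(col) / len(rows) for col in zip(*rows)]
    return out


def predict(cents, x):
    """Return (label, confidence) for x; confidence is the distance margin
    between the nearest and the second-nearest class centroid."""
    ranked = sorted(cents, key=lambda lab: dist(cents[lab], x))
    best = ranked[0]
    if len(ranked) == 1:
        return best, 0.0
    return best, dist(cents[ranked[1]], x) - dist(cents[best], x)


def co_train(view_a, view_b, labels, rounds=10):
    """Two-view co-training. labels[i] is 1 (faulty), 0 (clean) or None.

    In each round, a classifier is trained on one view, labels the single
    unlabeled statement it is most confident about, and that statement joins
    the shared labeled set used by the classifier on the other view.
    """
    labels = list(labels)
    for _ in range(rounds):
        for view in (view_a, view_b):
            known = [i for i, l in enumerate(labels) if l is not None]
            unknown = [i for i, l in enumerate(labels) if l is None]
            if not unknown:
                return labels
            cents = centroids([view[i] for i in known],
                              [labels[i] for i in known])
            guesses = {i: predict(cents, view[i]) for i in unknown}
            pick = max(unknown, key=lambda i: guesses[i][1])
            labels[pick] = guesses[pick][0]
    return labels


# Toy data (hypothetical): one row per executable statement.
# View A: (fraction of failing tests covering it, fraction of passing tests).
view_a = [(0.9, 0.1), (0.1, 0.9), (0.8, 0.2),
          (0.2, 0.8), (0.85, 0.3), (0.15, 0.7)]
# View B: two made-up static features scaled to [0, 1].
view_b = [(1.0, 0.8), (0.0, 0.2), (1.0, 0.7),
          (0.0, 0.1), (1.0, 0.9), (0.0, 0.3)]
# Only the first two statements carry a label; the rest are unlabeled.
seed_labels = [1, 0, None, None, None, None]

result = co_train(view_a, view_b, seed_labels)
print(result)
```

With only two labeled statements as seeds, the two views take turns labeling the remaining four, so the labeled set grows without further manual labeling. A real study would replace the nearest-centroid classifier and the toy features with the classifiers and attributes described in the paper.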
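Received: 2014-10-28
About the author: Zheng Wei (born 1975) is an associate professor at Northwestern Polytechnical University whose research focuses on software engineering and software testing.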

References:
[1] Binkley D. Source Code Analysis: A Road Map[C] //Proceedings of Future of Software Engineering, Minneapolis, USA, 2007:104-119
[2] Zhang M L, Zhou Z H. CoTrade: Confident Co-Training with Data Editing[J]. IEEE Trans on Systems, Man, and Cybernetics, Part B: Cybernetics, 2011, 41(6):1612-1626
[3] Ali S, Andrews J H, Dhandapani T, et al. Evaluating the Accuracy of Fault Localization Techniques[C] //Proceedings of the 2009 IEEE/ACM International Conference on Automated Software Engineering, 2009:76-87
[4] Wong W E, Debroy V, Golden R, et al. Effective Software Fault Localization Using an RBF Neural Network[J]. IEEE Trans on Reliability, 2012, 61(1):149-169
[5] Wong W E, Qi Y. BP Neural Network-Based Effective Fault Localization[J]. International Journal of Software Engineering and Knowledge Engineering, 2009, 19(4):573-597
[6] Briand L C, Labiche Y, Liu X. Using Machine Learning to Support Debugging with Tarantula[C] //The 18th IEEE International Symposium on Software Reliability, 2007:137-146
[7] Jones J A, Harrold M J, Stasko J. Visualization of Test Information to Assist Fault Localization[C] //Proceedings of the 24th International Conference on Software Engineering, 2002:467-477