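Article: 2017, Vol. 35, Issue 5: 876-883
Cite this article:
Liu Fei, Zhang Shaowu, Gao Hongyan. Gene Regulatory Network Construction Algorithm Based on Part Mutual Information and Bayesian Scoring Function[J]. Journal of Northwestern Polytechnical University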
Liu Fei, Zhang Shaowu, Gao Hongyan. Inferring Gene Regulatory Networks Based on Part Mutual Information and Bayesian Scoring Function[J]. Journal of Northwestern Polytechnical University

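Gene Regulatory Network Construction Algorithm Based on Part Mutual Information and Bayesian Scoring Function
Liu Fei1,2, Zhang Shaowu1, Gao Hongyan2
1. Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China;
2. Institute of Physics and Optoelectronics Technology, Baoji University of Arts and Science, Baoji, Shaanxi 721016, China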
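Abstract:
Reconstructing gene regulatory networks from gene expression data can effectively uncover regulatory relationships among genes and give a deeper understanding of biological regulatory processes. Traditional correlation coefficient and partial correlation coefficient models can detect only linear relationships between genes, whereas mutual information and conditional mutual information can detect nonlinear relationships and can handle high-dimensional, low-sample gene expression data. However, mutual information overestimates the association between genes and conditional mutual information underestimates it, so the inferred gene networks have high false-positive and false-negative rates, and neither measure can infer the direction of regulation. We therefore propose a new gene regulatory network construction algorithm, named PMIBSF, based on part mutual information and a Bayesian scoring function. PMIBSF first uses part mutual information to delete redundant edges from the initial gene association network, and then uses the mutual information test (MIT) scoring function for Bayesian networks to learn the Bayesian network structure, quickly constructing the gene regulatory network. Simulation experiments on in silico networks and real biological molecular networks show that PMIBSF outperforms the currently popular LP, PC-alg, NARROMI, and ARACNE algorithms and constructs gene regulatory networks with high accuracy.
Keywords:    part mutual information    mutual information test scoring    Bayesian network    covariance matrix    gene regulatory network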
Inferring Gene Regulatory Networks Based on Part Mutual Information and Bayesian Scoring Function
Liu Fei1,2, Zhang Shaowu1, Gao Hongyan2
1. Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an 710072, China;
2. Institute of Physics and Optoelectronics Technology, Baoji University of Arts and Science, Baoji 721016, China
Abstract:
Inferring gene regulatory networks (GRNs) from expression data can uncover direct regulatory relationships among genes and provide deep insight into biological processes at the network level. The most widely used criteria, the Pearson correlation coefficient and partial correlation, measure only linear direct associations and miss nonlinear ones. Mutual information (MI) and conditional mutual information (CMI) overcome this limitation and can also handle gene expression data that are high-dimensional with few samples. Although MI and CMI are widely used to quantify both linear and nonlinear associations, MI tends to overestimate and CMI tends to underestimate the association between genes. GRNs inferred with MI and CMI therefore suffer from high false-positive and false-negative rates, and neither measure can identify the direction of regulatory interactions. In this work, we present a novel algorithm, named PMIBSF, that combines part mutual information (PMI) with a Bayesian scoring function (BSF). Tested on synthetic networks as well as real biological molecular networks of different sizes and topologies, PMIBSF infers GRNs with higher accuracy and outperforms other state-of-the-art methods such as LP, PC-alg, NARROMI, and ARACNE.
Key words:    part mutual information    mutual information test scoring    Bayesian network    covariance matrix    gene regulatory network
Received: 2017-03-01     Revised:
DOI:
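Funding: Supported by the National Natural Science Foundation of China (91430111, 61473232, 61170134)
Corresponding author:     Email:
About the author: Liu Fei (born 1981), PhD candidate at Northwestern Polytechnical University; research focuses on bioinformatics.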
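
The abstract's point that mutual information overestimates direct association can be illustrated with a small sketch. It is not the PMIBSF algorithm and does not compute part mutual information. It assumes jointly Gaussian data, for which MI and CMI have closed forms in terms of covariance-matrix determinants. The function names, the simulated chain X -> Z -> Y, and the noise levels are all hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_mi(x, y):
    """MI in nats between two 1-D samples, assuming joint Gaussianity."""
    rho = np.corrcoef(x, y)[0, 1]
    return -0.5 * np.log(1.0 - rho ** 2)


def gaussian_cmi(x, y, z):
    """CMI I(X;Y|Z) in nats, assuming joint Gaussianity.

    Uses 0.5 * log(|C(X,Z)| * |C(Y,Z)| / (|C(Z)| * |C(X,Y,Z)|)),
    where C(...) is the sample covariance matrix of the listed variables.
    """
    def logdet(*cols):
        cov = np.atleast_2d(np.cov(np.vstack(cols)))
        return np.linalg.slogdet(cov)[1]

    return 0.5 * (logdet(x, z) + logdet(y, z) - logdet(z) - logdet(x, y, z))


# Hypothetical chain X -> Z -> Y: X influences Y only through Z.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
z = x + 0.5 * rng.normal(size=n)
y = z + 0.5 * rng.normal(size=n)

mi_xy = gaussian_mi(x, y)          # clearly positive despite no direct X-Y link
cmi_xy_z = gaussian_cmi(x, y, z)   # close to zero once Z is conditioned on

print(f"MI(X;Y)    = {mi_xy:.4f} nats")
print(f"CMI(X;Y|Z) = {cmi_xy_z:.4f} nats")
```

In this simulated chain, thresholding MI alone would keep a spurious X-Y edge, while conditioning on Z removes it. The sketch does not show the opposite failure mode, where CMI underestimates a real association.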
