Article: 2017, Vol. 35, Issue 6: 1047-1053
Cite this article:
Deng Zhilong, Zhang Qiwei, Cao Hao, Gu Zhiyang. A Scheduling Optimization Method Based on Depth Intensive Study[J]. Journal of Northwestern Polytechnical University, 2017, 35(6): 1047-1053

A Scheduling Optimization Method Based on Deep Reinforcement Learning
Deng Zhilong1, Zhang Qiwei2, Cao Hao2, Gu Zhiyang2
1. School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710072, China;
2. School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
Abstract:
Deep reinforcement learning combines the perceptual ability of deep learning with the decision-making ability of reinforcement learning, so that control actions can be derived directly from the input. It is an artificial intelligence method that is closer to the human way of thinking. Building on this combination, the paper studies a design framework for a resource scheduling algorithm based on deep reinforcement learning. The framework first trains a deep learning network on a large amount of prior data collected from network nodes. It then uses reinforcement learning to allocate network resources, and learns a value network through deep reinforcement learning over a large number of self-play games. Finally, the performance of the algorithm is evaluated through simulation and comparison experiments, and the results confirm its effectiveness.
Key words:    deep learning    scheduling algorithms    Monte Carlo simulation    reinforcement learning
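The abstract does not give the algorithm itself, so the following is only a toy sketch of the general idea of learning action values for resource allocation from Monte Carlo rollouts. It is not the authors' method: a lookup table stands in for the value network, there is no pre-training on prior data, and the task costs, node count, function names (`rollout`, `train`, `greedy_schedule`) and hyperparameters are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def rollout(tasks, n_nodes, q, epsilon, rng):
    """Schedule every task once; return visited (state, action) pairs and the makespan."""
    loads = [0] * n_nodes
    visited = []
    for i, cost in enumerate(tasks):
        state = (i, tuple(loads))  # task index plus current node loads
        if rng.random() < epsilon:
            action = rng.randrange(n_nodes)  # explore: pick a random node
        else:
            # exploit: pick the node with the highest estimated value
            action = max(range(n_nodes), key=lambda a: q[(state, a)])
        visited.append((state, action))
        loads[action] += cost
    return visited, max(loads)


def train(tasks, n_nodes, episodes=3000, epsilon=0.2, alpha=0.1, seed=0):
    """Every-visit Monte Carlo control with a tabular action-value estimate."""
    rng = random.Random(seed)
    q = defaultdict(float)  # hypothetical stand-in for a value network
    for _ in range(episodes):
        visited, makespan = rollout(tasks, n_nodes, q, epsilon, rng)
        ret = -makespan  # single terminal reward: a shorter makespan is better
        for key in visited:
            q[key] += alpha * (ret - q[key])  # move the estimate toward the return
    return q


def greedy_schedule(tasks, n_nodes, q):
    """Follow the learned values without exploration."""
    visited, makespan = rollout(tasks, n_nodes, q, epsilon=0.0, rng=random.Random(0))
    return [action for _, action in visited], makespan


if __name__ == "__main__":
    tasks = [4, 3, 3, 2, 2]  # assumed task costs
    q = train(tasks, n_nodes=2)
    assignment, makespan = greedy_schedule(tasks, 2, q)
    print("assignment:", assignment, "makespan:", makespan)
```

The state here is the pair of task index and node loads, and the only reward is the negative makespan at the end of an episode. In a setting closer to the one the abstract describes, the table would presumably be replaced by a trained network and the toy task list by measurements from network nodes.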
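Received: 2017-02-01
Funding: Supported by the National Natural Science Foundation of China (U1609216)
About the author: Deng Zhilong (born 1976) is a PhD candidate at Northwestern Polytechnical University whose research focuses on network big data and data mining.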
