Paper: 2020, Vol. 38, Issue(3): 589-595
Cite this article:
TANG Xiaochun, FU Ying, FAN Xuefeng. Fine-Grained Allocation Algorithm for Sharing Heterogeneous Resources in Data Center[J]. Journal of Northwestern Polytechnical University, 2020, 38(3): 589-595

Fine-Grained Allocation Algorithm for Sharing Heterogeneous Resources in Data Center
TANG Xiaochun, FU Ying, FAN Xuefeng
School of Computer Science and Technology, Northwestern Polytechnical University, Xi'an 710072, China
Abstract:
Data in a data center are stored dispersively, so data-oriented computing spreads big data analysis tasks across different computing nodes. The extensive use of graphics processing units (GPUs) makes it urgent to study how to assign heterogeneous resources reasonably to different computing frameworks. We investigate existing big data computing frameworks and GPU computing and, based on existing cluster resource management and GPU management models, propose a centralized heterogeneous resource management model that combines CPU and GPU resources: each computing node manages its local resources and executes its tasks, while the resource management center uniformly manages the various computing frameworks. We design and implement a hybrid dominant-resource sharing and allocation algorithm, which computes each framework's usage of its dominant resource and preferentially allocates free resources to the framework with the smallest dominant-resource share. This shares the dominant resources fairly among the frameworks and prevents an excess of CPU tasks from starving the GPU resources, or vice versa. The experimental results show that the allocation algorithm improves heterogeneous resource utilization and the number of completed tasks by around 15%.
Key words:    hybrid dominant resource    allocation algorithm    heterogeneous resource    resource sharing    data center
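The allocation policy described in the abstract (preferentially give free resources to the framework whose dominant-resource share is currently smallest) follows the dominant-resource-fairness idea. A minimal Python sketch of that loop, with illustrative cluster capacities, framework names, and per-task demands (all hypothetical, not the paper's implementation):

```python
# DRF-style allocator over CPU and GPU resources: repeatedly grant one
# task's worth of resources to the framework with the smallest dominant share.
# Capacities, framework names, and demands below are illustrative only.

TOTAL = {"cpu": 16.0, "gpu": 4.0}  # cluster capacity

class Framework:
    def __init__(self, name, demand):
        self.name = name
        self.demand = demand                    # per-task demand, e.g. {"cpu": 1, "gpu": 1}
        self.alloc = {"cpu": 0.0, "gpu": 0.0}   # resources granted so far

    def dominant_share(self):
        # Dominant share = max over resources of (allocated / total capacity).
        return max(self.alloc[r] / TOTAL[r] for r in TOTAL)

def allocate(frameworks, free):
    """Grant resources one task at a time to the framework whose
    dominant share is smallest, until nothing more fits."""
    while True:
        # Frameworks whose next task still fits in the free resources.
        ready = [f for f in frameworks
                 if all(f.demand.get(r, 0) <= free[r] for r in TOTAL)]
        if not ready:
            return
        f = min(ready, key=lambda f: f.dominant_share())
        for r in TOTAL:
            need = f.demand.get(r, 0)
            f.alloc[r] += need
            free[r] -= need

cpu_job = Framework("spark", {"cpu": 2, "gpu": 0})      # CPU-only framework
gpu_job = Framework("training", {"cpu": 1, "gpu": 1})   # mixed CPU/GPU framework
allocate([cpu_job, gpu_job], dict(TOTAL))
# Final allocation here: spark 12 CPU, training 4 CPU + 4 GPU —
# neither the CPU-heavy nor the GPU-using framework starves the other.
```

Because the dominant share of the CPU-only framework is measured against total CPU while the GPU framework's is measured against total GPU, neither resource type can be monopolized, which is the "hunger"-prevention property the abstract describes.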
Received: 2019-09-11     Revised:
DOI: 10.1051/jnwpu/20203830589
Foundation item: Supported by the Key Research and Development Program of the Ministry of Science and Technology (2018YFB1003403) and the Postgraduate Creativity and Innovation Seed Foundation of Northwestern Polytechnical University (ZZ2019204)
Biography: TANG Xiaochun (1969-), associate professor at Northwestern Polytechnical University; research interests include big data processing and cloud computing.
