Article: 2021, Vol. 39, Issue 2: 430-438
Cite this article:
JING Changhong, LIU Wenjie, GAO Jintao, PEI Ouya. Research and implementation of HTAP for distributed database[J]. Journal of Northwestern Polytechnical University

Research and implementation of HTAP for distributed database
JING Changhong, LIU Wenjie, GAO Jintao, PEI Ouya
School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, Shaanxi, China
Abstract:
Data processing can be roughly divided into two categories: online transaction processing (OLTP) and online analytical processing (OLAP). OLTP is the main application of traditional relational databases and supports basic day-to-day transaction processing, such as routine bank transactions. OLAP is the main application of data warehouse systems; it supports more complex data analysis operations, focuses on decision support, and presents analysis results in an accessible, intuitive form. As the volume of data that enterprises process keeps growing, distributed databases have gradually replaced stand-alone databases and become the mainstream choice for applications. However, the workloads that distributed databases currently support are mainly OLTP, and OLAP support is lacking. This paper proposes a method for implementing hybrid transactional/analytical processing (HTAP) in the distributed database CBase. The method gives CBase a way to perform OLAP analysis and to handle analysis over large volumes of data.
Key words:    distributed database    HTAP    OLAP    data analysis
Received: 2020-07-17     Revised:
DOI: 10.1051/jnwpu/20213920430
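Funding: Supported by the National Natural Science Foundation of China (61672432, 61732014)
Corresponding author: LIU Wenjie (1976-), female, associate professor at Northwestern Polytechnical University; research interests include cloud computing, big data processing, and massive distributed databases. E-mail: liuwenjie@nwpu.edu.cn
About the author: JING Changhong (1997-), master's student at Northwestern Polytechnical University; research interest is distributed databases.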