Paper: 2021, Vol. 39, Issue 2: 430-438

Cite this article:

JING Changhong, LIU Wenjie, GAO Jintao, PEI Ouya. Research and implementation of HTAP for distributed database[J]. Journal of Northwestern Polytechnical University
Research and implementation of HTAP for distributed database

JING Changhong, LIU Wenjie, GAO Jintao, PEI Ouya

School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China
Abstract:
Data processing can be roughly divided into two categories: online transaction processing (OLTP) and online analytical processing (OLAP). OLTP is the main application of traditional relational databases and supports basic day-to-day transaction processing, such as bank transactions. OLAP is the main application of data warehouse systems; it supports more complex data analysis operations, focuses on decision support, and provides intuitive, easy-to-understand analysis results. As the volume of data processed by enterprises continues to grow, distributed databases have gradually replaced stand-alone databases and become the mainstream choice for applications. However, the workloads currently supported by distributed databases are mainly OLTP applications, and OLAP support is lacking. This paper proposes an HTAP implementation method for the distributed database CBase, which gives CBase a way to perform OLAP analysis and can readily handle the analysis of large volumes of data.
Key words:
distributed database
HTAP
OLAP
data analysis
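
Received: 2020-07-17
Revised:

DOI: 10.1051/jnwpu/20213920430
Funding: Supported by the National Natural Science Foundation of China (61672432, 61732014)
Corresponding author: LIU Wenjie (born 1976), female, associate professor at Northwestern Polytechnical University, whose research focuses on cloud computing, big data processing, and massive distributed databases. E-mail: liuwenjie@nwpu.edu.cn
About the author: JING Changhong (born 1997), master's student at Northwestern Polytechnical University, whose research focuses on distributed databases.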