Article: 2021, Vol. 39, Issue 4: 909-918
Cite this article:
ZHANG Chenyu, LIU Wenjie, PANG Tianze, YUE Yantao. Optimization of correlate subquery based on distributed database[J]. Journal of Northwestern Polytechnical University

Optimization of correlate subquery based on distributed database
ZHANG Chenyu1, LIU Wenjie1, PANG Tianze2, YUE Yantao2
1. School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China;
2. Bank of Communications, Shanghai 200120, China
Abstract:
Subqueries are widely used in databases. Depending on whether a subquery depends on the tables of its parent query, subqueries can be divided into correlated and non-correlated subqueries. A correlated subquery is executed after a tuple has been fetched from the parent query, so the subquery must be evaluated repeatedly, once per parent tuple. This strategy incurs high disk access costs, and in a distributed database, where data communication adds further overhead, it is inefficient when the parent query yields many tuples. For this class of subqueries, building on existing query optimization strategies and taking the characteristics of distributed databases into account, this paper proposes a subquery optimization strategy for distributed databases that pulls subqueries up into join queries, eliminates redundant clauses in subqueries, and eliminates aggregate functions. Experiments verify the effectiveness of the proposed strategy.
Key words:    distributed database    correlated subquery optimization    rule-based optimization
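The contrast described in the abstract, between evaluating a correlated subquery once per parent tuple and pulling it up into a join, can be illustrated with a minimal sketch. It is not the authors' implementation: the `emp` table, its data, and the helper names `build_demo_db`, `run`, and `nested_iteration` are assumptions made for illustration, and it runs on a single in-memory SQLite database rather than a distributed one. The rewrite shown is valid for this `AVG` comparison; other aggregates (for example `COUNT` over empty groups) need extra care.

```python
import sqlite3

# Hypothetical schema and data, used only to illustrate the rewrite.
SETUP = """
CREATE TABLE emp (id INTEGER PRIMARY KEY, dept INTEGER, salary INTEGER);
INSERT INTO emp VALUES
    (1, 10, 100), (2, 10, 300),
    (3, 20, 200), (4, 20, 200),
    (5, 30, 500), (6, 30, 700);
"""

# Correlated form: the inner query references e.dept from the parent query.
CORRELATED = """
SELECT e.id FROM emp AS e
WHERE e.salary > (SELECT AVG(s.salary) FROM emp AS s WHERE s.dept = e.dept)
ORDER BY e.id
"""

# Pulled-up form: the subquery becomes a derived table computed once,
# and the correlation predicate becomes the join condition.
PULLED_UP = """
SELECT e.id FROM emp AS e
JOIN (SELECT dept, AVG(salary) AS avg_salary FROM emp GROUP BY dept) AS d
  ON d.dept = e.dept
WHERE e.salary > d.avg_salary
ORDER BY e.id
"""


def build_demo_db():
    """Create the in-memory demo database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    return conn


def run(conn, sql):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql)]


def nested_iteration(conn):
    """Mimic tuple-at-a-time evaluation: one subquery call per parent tuple."""
    parents = conn.execute(
        "SELECT id, dept, salary FROM emp ORDER BY id"
    ).fetchall()
    hits, subquery_runs = [], 0
    for emp_id, dept, salary in parents:
        (avg_salary,) = conn.execute(
            "SELECT AVG(salary) FROM emp WHERE dept = ?", (dept,)
        ).fetchone()
        subquery_runs += 1  # in a distributed system, each call may be a network round trip
        if salary > avg_salary:
            hits.append(emp_id)
    return hits, subquery_runs


if __name__ == "__main__":
    conn = build_demo_db()
    print(run(conn, CORRELATED))
    print(run(conn, PULLED_UP))
    print(nested_iteration(conn))
```

Both query forms select the same employees, while `nested_iteration` makes the per-tuple cost visible: the number of subquery executions grows with the number of parent tuples, whereas the joined form computes the per-department averages once.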
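Received: 2020-11-25     Revised:
DOI: 10.1051/jnwpu/20213940909
Funding: Supported by the National Natural Science Foundation of China, General Program (61672432) and Key Program (61732014)
Corresponding author: LIU Wenjie (b. 1976), female, associate professor at Northwestern Polytechnical University; her research covers cloud computing, big data processing, and massive distributed databases. E-mail: liuwenjie@nwpu.edu.cn
About the author: ZHANG Chenyu (b. 1997), master's student at Northwestern Polytechnical University; his research focuses on distributed databases.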