A parallel query processing system based on graph-based database partitioning

Title
A parallel query processing system based on graph-based database partitioning
Authors
Nam, Yoon-Min; Han, Donghyoung; Kim, Min-Soo
DGIST Authors
Kim, Min-Soo
Issue Date
2019-04
Citation
Information Sciences, 480, 237-260
Type
Article
Article Type
Article
Author Keyword
Graph-based partitioning; Horizontal database partitioning; Parallel query processing
Keyword
Benchmarking; Database systems; Digital storage; Graphic methods; Redundancy; Trees (mathematics); Database partitioning; Graph-based; Large amounts of data; Parallel database systems; Parallel query processing; Partitioning methods; Query performance; State-of-the-art methods; Query processing
ISSN
0020-0255
Abstract
As parallel database systems have large amounts of data to process, it is important to utilize a scalable and efficient horizontal database partitioning method. The existing partitioning methods have major drawbacks: they not only cause large amounts of data redundancy but also still require expensive shuffle operations for join queries in many cases, despite their high data redundancy. We elucidate the drawbacks originating from the tree-based partitioning schemes and propose a novel graph-based database partitioning method called GPT that both improves the query performance and reduces data redundancy. We integrate the proposed GPT method into a parallel query processing system, Spark SQL, across all the relevant layers and modules, including the query plan generator and the scan operator. Through extensive experiments using three benchmarks, TPC-DS, IMDB and BioWarehouse, we show that GPT significantly outperforms the state-of-the-art method in terms of both storage overhead and query performance. © 2018 Elsevier Inc.
URI
http://hdl.handle.net/20.500.11750/9526
DOI
10.1016/j.ins.2018.12.031
Publisher
Elsevier BV
Related Researcher
  • Author: Kim, Min-Soo (InfoLab)
  • Research Interests: Big Data Systems; Big Data Mining & Machine Learning; Big Data Bioinformatics; Data Mining and Big Data Analytics; Bioinformatics and Neuroinformatics; Brain-Machine Interface (BMI)
Collection:
Department of Information and Communication Engineering > InfoLab > 1. Journal Articles