Cited 0 time in webofscience Cited 0 time in scopus

A large-scale graph processing method for multi-at-tribute graphs using GPUs and SSDs

A large-scale graph processing method for multi-at-tribute graphs using GPUs and SSDs
Translated Title
복수 개의 GPU와 SSD를 이용한 대규모의 다중 속성 그래프 처리 방법
An, Kyu Hyeon
DGIST Authors
An, Kyu Hyeon; Kim, Min SooHwang, Dae Hee
Kim, Min Soo
Hwang, Dae Hee
Issue Date
Available Date
Degree Date
2017. 2
Graph processingGPUsSSDsStreamParallel programming그래프 처리병렬 프로그래밍
The one of characteristics is simpeness of structure that provides facile data analysis. The importance of graph processing is growing due to that. However the handling on large-scale graph using existing methods becomes harder due to the increase of data size. The distributed graph system requires a lot of machine for scalability. However network overhead increases linearly and is larger than processing time. Meanwhile GPU is used as coprocessor for handling graph data via thousands of GPU cores. However all of existing methods using GPU fail to process large-scale graphs due to lack of main memory. In this paper, we propose the method GStream 2.0 that handles large-scale graphs which is larger than main memory using a single machine. Proposed method is streaming the graph data from secondary storage to GPU memory and launches graph algorithm on GPU device. Through streaming graph data via slotted page format which is partitioning graph data with fixed size, proposed method handles large-scale graph which is larger than GPU device memory. Performance is improved by avoiding intermediate data transfer and proposed method is using the array for result of algorithm which is sorted in GPU device memory. For supporting graph algorithm based on edge attribute, we extend slotted page format. Finally, we propose the extended model for 2-hop neighborhood algorithm. Through experiments, we show that GStream 2.0 significantly outperforms the major methods and the state-of-the-art methods. ⓒ 2017 DGIST
Table Of Contents
Ⅰ. INTRODUCTION 1-- Ⅱ. PRELIMINARIES 5-- Ⅲ. MAIN CONCEPT OF GSTREAM 2.0 7-- 3.1 Main concept of GStream 2.0 7-- 3.2 Asynchronous multiple streaming 9-- 3.3 Major types of graph algorithms 11-- 3.4 Algorithm of the framework 12-- Ⅳ. THE STRATEGY OF GSTREAM 2.0 16-- 4.1 Strategy for performance 16-- 4.2 Strategy for scalability 18-- Ⅴ. EXTENDED SLOTTED PAGE FORMAT 21-- 5.1 Extended slotted page format for trillion-scale graph 21-- 5.2 Extended slotted page format for edge attributes 22-- Ⅵ. EXTENDED PROCESSING MODEL OF GSTREAM 2.0 24-- Ⅶ. EXPERIMENTAL EVALUATION 26-- 7.1 Experimental setup 26-- 7.2 Comparison with Distributed Methods 29-- 7.3 Comparison with CPU-based Methods 29-- 7.4 Comparison with GPU-based Methods 31-- 7.5 Characteristics of GStream 2.0 33-- 7.6 Additional graph algorithms 35-- 7.7 Experiment for edge attribute 36-- Ⅷ. CONCLUSION 37
Information and Communication Engineering
Related Researcher
  • Author Kim, Min-Soo InfoLab
  • Research Interests Big Data Systems; Big Data Mining & Machine Learning; Big Data Bioinformatics; 데이터 마이닝 및 빅데이터 분석; 바이오인포메틱스 및 뉴로인포메틱스; 뇌-기계 인터페이스(BMI)
Department of Information and Communication EngineeringThesesMaster

qrcode mendeley

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.