Cited time in webofscience Cited time in scopus

A large-scale graph processing method for multi-at-tribute graphs using GPUs and SSDs

Title
A large-scale graph processing method for multi-at-tribute graphs using GPUs and SSDs
Alternative Title
복수 개의 GPU와 SSD를 이용한 대규모의 다중 속성 그래프 처리 방법
Author(s)
An, Kyu Hyeon
DGIST Authors
An, Kyu HyeonKim, Min SooHwang, Dae Hee
Advisor
Kim, Min Soo
Co-Advisor(s)
Hwang, Dae Hee
Issued Date
2017
Awarded Date
2017. 2
Type
Thesis
Subject
Graph processingGPUsSSDsStreamParallel programming그래프 처리병렬 프로그래밍
Abstract
The one of characteristics is simpeness of structure that provides facile data analysis. The importance of graph processing is growing due to that. However the handling on large-scale graph using existing methods becomes harder due to the increase of data size. The distributed graph system requires a lot of machine for scalability. However network overhead increases linearly and is larger than processing time. Meanwhile GPU is used as coprocessor for handling graph data via thousands of GPU cores. However all of existing methods using GPU fail to process large-scale graphs due to lack of main memory.
In this paper, we propose the method GStream 2.0 that handles large-scale graphs which is larger than main memory using a single machine. Proposed method is streaming the graph data from secondary storage to GPU memory and launches graph algorithm on GPU device. Through streaming graph data via slotted page format which is partitioning graph data with fixed size, proposed method handles large-scale graph which is larger than GPU device memory. Performance is improved by avoiding intermediate data transfer and proposed method is using the array for result of algorithm which is sorted in GPU device memory. For supporting graph algorithm based on edge attribute, we extend slotted page format. Finally, we propose the extended model for 2-hop neighborhood algorithm.
Through experiments, we show that GStream 2.0 significantly outperforms the major methods and the state-of-the-art methods. ⓒ 2017 DGIST
Table Of Contents
Ⅰ. INTRODUCTION 1--
Ⅱ. PRELIMINARIES 5--
Ⅲ. MAIN CONCEPT OF GSTREAM 2.0 7--
3.1 Main concept of GStream 2.0 7--
3.2 Asynchronous multiple streaming 9--
3.3 Major types of graph algorithms 11--
3.4 Algorithm of the framework 12--
Ⅳ. THE STRATEGY OF GSTREAM 2.0 16--
4.1 Strategy for performance 16--
4.2 Strategy for scalability 18--
Ⅴ. EXTENDED SLOTTED PAGE FORMAT 21--
5.1 Extended slotted page format for trillion-scale graph 21--
5.2 Extended slotted page format for edge attributes 22--
Ⅵ. EXTENDED PROCESSING MODEL OF GSTREAM 2.0 24--
Ⅶ. EXPERIMENTAL EVALUATION 26--
7.1 Experimental setup 26--
7.2 Comparison with Distributed Methods 29--
7.3 Comparison with CPU-based Methods 29--
7.4 Comparison with GPU-based Methods 31--
7.5 Characteristics of GStream 2.0 33--
7.6 Additional graph algorithms 35--
7.7 Experiment for edge attribute 36--
Ⅷ. CONCLUSION 37
URI
http://dgist.dcollection.net/jsp/common/DcLoOrgPer.jsp?sItemId=000002324205

http://hdl.handle.net/20.500.11750/1504
DOI
10.22677/thesis.2324205
Degree
Master
Department
Information and Communication Engineering
Publisher
DGIST
Files in This Item:
000002324205.pdf

000002324205.pdf

기타 데이터 / 1.31 MB / Adobe PDF download
Appears in Collections:
Department of Electrical Engineering and Computer Science Theses Master

qrcode

  • twitter
  • facebook
  • mendeley

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.

BROWSE