Sparse matrices are widely used to analyze complex systems that require many linear algebraic operations, in areas such as computer graphics, recommender systems, machine learning, and information retrieval. As real-world graphs grow rapidly, the matrices that represent them grow as well, and developing fast and scalable methods for handling such large-scale sparse matrices has become harder than before. There have been various studies on GPU-based methods for sparse general matrix-matrix multiplication (SpGEMM). However, most of those methods face two major challenges. First, the irregularity of the matrices causes poor load balancing among threads. Second, many matrices do not fit in GPU device memory. We propose MStream, a scalable and efficient GPU-based sparse-sparse matrix multiplication method that exploits GPU streams to hide the latency of copying data between main memory and device memory. To fully exploit GPU streams, we explore the use of an efficient data structure called the slotted page format, which divides a matrix into small fixed-size units, instead of operating on the Compressed Sparse Row (CSR) or Compressed Sparse Column (CSC) formats that are widely used to represent a graph as a sparse matrix in memory. ⓒ 2017 DGIST
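To make the idea of multiplying sparse matrices in bounded pieces concrete, the following is a minimal CPU-side sketch in Python. It is not MStream and does not implement the slotted page format, whose layout is not given in this abstract. It assumes plain CSR inputs, uses row-wise accumulation, and splits the left matrix into row blocks limited by a nonzero budget. The names `row_blocks`, `spgemm_block`, `spgemm`, and `max_nnz` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterator, List, Tuple

# CSR matrix as (indptr, indices, data, n_cols); plain lists keep the sketch dependency-free.
CSR = Tuple[List[int], List[int], List[float], int]


def row_blocks(indptr: List[int], max_nnz: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open row ranges [start, stop) with at most max_nnz nonzeros each.

    A single row that exceeds the budget still gets a block of its own.
    """
    n_rows = len(indptr) - 1
    start = 0
    while start < n_rows:
        stop = start + 1
        while stop < n_rows and indptr[stop + 1] - indptr[start] <= max_nnz:
            stop += 1
        yield start, stop
        start = stop


def spgemm_block(a: CSR, b: CSR, start: int, stop: int) -> List[Dict[int, float]]:
    """Multiply rows [start, stop) of A by B, one sparse accumulator per output row."""
    a_ptr, a_idx, a_val, _ = a
    b_ptr, b_idx, b_val, _ = b
    rows: List[Dict[int, float]] = []
    for i in range(start, stop):
        acc: Dict[int, float] = {}
        for p in range(a_ptr[i], a_ptr[i + 1]):
            k, v = a_idx[p], a_val[p]
            # Scale row k of B by A[i, k] and add it into output row i.
            for q in range(b_ptr[k], b_ptr[k + 1]):
                j = b_idx[q]
                acc[j] = acc.get(j, 0.0) + v * b_val[q]
        rows.append(acc)
    return rows


def spgemm(a: CSR, b: CSR, max_nnz: int) -> CSR:
    """Compute C = A * B block by block and assemble the result in CSR form."""
    indptr: List[int] = [0]
    indices: List[int] = []
    data: List[float] = []
    for start, stop in row_blocks(a[0], max_nnz):
        # Assumption: in a GPU setting, each block would be one unit of transfer and compute.
        for acc in spgemm_block(a, b, start, stop):
            for j in sorted(acc):
                indices.append(j)
                data.append(acc[j])
            indptr.append(len(indices))
    return indptr, indices, data, b[3]
```

Because each block is processed independently, the result does not depend on the chosen budget. That independence is what would allow one block to be copied to a device while another is being multiplied, although this sketch does neither.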
Table Of Contents
I. INTRODUCTION 1 -- II. PRELIMINARIES 5 -- 2.1 Many-core GPU 5 -- 2.2 Sparse Format for Matrix 6 -- 2.3 Slotted Page for Representing Sparse Matrix 6 -- 2.4 Two general schemes for matrix multiplication 8 -- 2.5 Observations 10 -- III. MSTREAM METHOD 14 -- 3.1 Analysis of Algorithms for MStream 14 -- 3.2 Streaming Technique using GPU streams 20 -- 3.3 Overall Framework of MStream 22 -- IV. EXPERIMENTS 25 -- 4.1 Experimental setup 25 -- 4.2 Comparison with cuSparse 26 -- 4.3 Characteristic of MStream 27 -- V. CONCLUSIONS 29