
Improving Hardware Efficiency of a Sparse Training Accelerator by Restructuring a Reduction Network

Title: Improving Hardware Efficiency of a Sparse Training Accelerator by Restructuring a Reduction Network
Author(s): Shin, Banseok; Park, Sehun; Kung, Jaeha
Issued Date: 2023-06-27
Citation: IEEE Interregional NEWCAS Conference, NEWCAS 2023, pp. 191480
Type: Conference Paper
ISBN: 9798350300246
ISSN: 2474-9672
Abstract:
Deep learning is used in a wide range of applications, including recommendation systems, natural language processing, and image processing. When training deep learning models, the input or weight matrices exhibit inherent sparsity; therefore, efficiently processing sparse general matrix multiplication (spGEMM) is key to improving training efficiency. In this paper, we present an improved spGEMM accelerator obtained by restructuring its reduction network and adopting a suitable matrix tiling strategy. The proposed reduction network reduces the overall hardware area by 21.8% and power consumption by 37.5%. When the stationary matrix is 80% sparse and the streaming matrix is 99% sparse, the proposed matrix tiling method reduces the required clock cycles by 80%. © 2023 IEEE.
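
As a software-level illustration only (the paper describes a hardware accelerator; this sketch is not the authors' design), the following minimal Python example shows the stationary/streaming split behind spGEMM tiling: one sparse operand stays resident while column tiles of the other stream through, and the work per tile scales with the nonzeros that actually meet. The function name tiled_spgemm, the tile width, and the matrix sizes and densities are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of tiled sparse-sparse matmul (spGEMM) using SciPy.
# An illustrative software analogy, NOT the paper's hardware design;
# tile width, sizes, and densities below are arbitrary assumptions.
import scipy.sparse as sp

def tiled_spgemm(A, B, tile):
    """Multiply sparse A (m x k) by sparse B (k x n), one column tile of B
    at a time: A stays resident (stationary) while tiles of B stream in."""
    assert A.shape[1] == B.shape[0], "inner dimensions must match"
    n = B.shape[1]
    out_tiles = []
    for j in range(0, n, tile):
        B_tile = B[:, j:j + tile]      # streaming tile of the second operand
        out_tiles.append(A @ B_tile)   # sparse-sparse product for this tile
    return sp.hstack(out_tiles).tocsr()

# 80% sparse stationary operand, 99% sparse streaming operand, matching the
# sparsity levels quoted in the abstract (matrix sizes are arbitrary).
A = sp.random(256, 256, density=0.20, format="csr", random_state=1)
B = sp.random(256, 256, density=0.01, format="csr", random_state=2)
C = tiled_spgemm(A, B, tile=64)
# Multiply-accumulate work tracks the nonzeros that pair up, which is why
# higher operand sparsity translates into fewer required clock cycles.
print(f"nnz(A)={A.nnz}, nnz(B)={B.nnz}, nnz(C)={C.nnz}")
```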
URI: http://hdl.handle.net/20.500.11750/47937
DOI: 10.1109/NEWCAS57931.2023.10198090
Publisher: IEEE CAS Society
Files in This Item: There are no files associated with this item.

Appears in Collections: ETC 2. Conference Papers


