
Balancing Computation Loads and Optimizing Input Vector Loading in LSTM Accelerators

DC Field Value Language
dc.contributor.author Park, Junki -
dc.contributor.author Yi, Wooseok -
dc.contributor.author Ahn, Daehyun -
dc.contributor.author Kung, Jaeha -
dc.contributor.author Kim, Jae-Joon -
dc.date.accessioned 2021-01-13T05:27:38Z -
dc.date.available 2021-01-13T05:27:38Z -
dc.date.created 2020-09-16 -
dc.date.issued 2020-09 -
dc.identifier.issn 0278-0070 -
dc.identifier.uri http://hdl.handle.net/20.500.11750/12562 -
dc.description.abstract The long short-term memory (LSTM) is a widely used neural network model for dealing with time-varying data. To reduce the memory requirement, pruning is often applied to the weight matrix of the LSTM, which makes the matrix sparse. In this paper, we present a new sparse matrix format, named rearranged compressed sparse column (RCSC), to maximize the inference speed of the LSTM hardware accelerator. The RCSC format speeds up inference by: 1) evenly distributing the computation loads to processing elements (PEs) and 2) reducing input vector load misses within the local buffer. We also propose a hardware architecture that adopts a hierarchical input buffer to further reduce the pipeline stalls that cannot be handled by the RCSC format alone. Simulation results for various datasets show that the combined use of the RCSC format and the proposed hardware reduces inference runtime by 2x on average compared to the previous work. -
dc.language English -
dc.publisher Institute of Electrical and Electronics Engineers -
dc.title Balancing Computation Loads and Optimizing Input Vector Loading in LSTM Accelerators -
dc.type Article -
dc.identifier.doi 10.1109/TCAD.2019.2926482 -
dc.identifier.wosid 000562034400012 -
dc.identifier.scopusid 2-s2.0-85086447025 -
dc.identifier.bibliographicCitation Park, Junki. (2020-09). Balancing Computation Loads and Optimizing Input Vector Loading in LSTM Accelerators. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 39(9), 1889–1901. doi: 10.1109/TCAD.2019.2926482 -
dc.description.isOpenAccess FALSE -
dc.subject.keywordAuthor Sparse matrices -
dc.subject.keywordAuthor Logic gates -
dc.subject.keywordAuthor Hardware -
dc.subject.keywordAuthor Computer architecture -
dc.subject.keywordAuthor Clocks -
dc.subject.keywordAuthor History -
dc.subject.keywordAuthor Standards -
dc.subject.keywordAuthor Accelerators -
dc.subject.keywordAuthor computer architecture -
dc.subject.keywordAuthor hardware -
dc.subject.keywordAuthor machine learning -
dc.subject.keywordAuthor recurrent neural networks (RNNs) -
dc.citation.endPage 1901 -
dc.citation.number 9 -
dc.citation.startPage 1889 -
dc.citation.title IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems -
dc.citation.volume 39 -
dc.description.journalRegisteredClass scie -
dc.description.journalRegisteredClass scopus -
dc.relation.journalResearchArea Computer Science; Engineering -
dc.relation.journalWebOfScienceCategory Computer Science, Hardware & Architecture; Computer Science, Interdisciplinary Applications; Engineering, Electrical & Electronic -
dc.type.docType Article -
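
The abstract's first claim, that inference speeds up when computation loads are evenly distributed to PEs, can be illustrated with a small sketch. This is not the RCSC format itself, whose details the abstract does not give. It assumes a standard compressed sparse column (CSC) encoding, a hypothetical two-PE setup, and a simple greedy row assignment chosen only for illustration. All function names are hypothetical.

```python
from typing import List, Tuple


def to_csc(dense: List[List[int]]) -> Tuple[List[int], List[int], List[int]]:
    """Encode a dense matrix in standard CSC form: (values, row_idx, col_ptr)."""
    n_rows, n_cols = len(dense), len(dense[0])
    values, row_idx, col_ptr = [], [], [0]
    for c in range(n_cols):
        for r in range(n_rows):
            if dense[r][c] != 0:
                values.append(dense[r][c])
                row_idx.append(r)
        col_ptr.append(len(values))  # end of column c
    return values, row_idx, col_ptr


def nnz_per_row(row_idx: List[int], n_rows: int) -> List[int]:
    """Count the nonzeros stored for each row."""
    counts = [0] * n_rows
    for r in row_idx:
        counts[r] += 1
    return counts


def interleaved_assignment(n_rows: int, num_pes: int) -> List[int]:
    """Baseline (assumption): row r is handled by PE r mod num_pes."""
    return [r % num_pes for r in range(n_rows)]


def greedy_assignment(row_nnz: List[int], num_pes: int) -> List[int]:
    """Illustrative balancing: heaviest rows first, each to the least-loaded PE."""
    load = [0] * num_pes
    assign = [0] * len(row_nnz)
    for r in sorted(range(len(row_nnz)), key=lambda i: -row_nnz[i]):
        pe = load.index(min(load))
        assign[r] = pe
        load[pe] += row_nnz[r]
    return assign


def pe_load(row_nnz: List[int], assign: List[int], num_pes: int) -> List[int]:
    """Total nonzeros (multiply-accumulate operations) each PE must process."""
    load = [0] * num_pes
    for r, pe in enumerate(assign):
        load[pe] += row_nnz[r]
    return load


# A small pruned weight matrix with uneven row density (made-up values).
W = [
    [1, 2, 3, 4],
    [0, 5, 0, 0],
    [6, 7, 8, 9],
    [0, 0, 0, 1],
]
values, row_idx, col_ptr = to_csc(W)
row_nnz = nnz_per_row(row_idx, len(W))

baseline = pe_load(row_nnz, interleaved_assignment(len(W), 2), 2)
balanced = pe_load(row_nnz, greedy_assignment(row_nnz, 2), 2)
print("interleaved PE loads:", baseline)
print("rearranged PE loads: ", balanced)
```

In this toy case the busiest PE determines the runtime, so lowering the maximum per-PE load is what shortens inference. How the actual RCSC format rearranges the matrix is described in the paper, not here.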



Related Researcher

Kung, Jaeha (궁재하)

Department of Electrical Engineering and Computer Science
