Full metadata record

DC Field Value Language
dc.contributor.author Park, Junki -
dc.contributor.author Yi, Wooseok -
dc.contributor.author Ahn, Daehyun -
dc.contributor.author Kung, Jaeha -
dc.contributor.author Kim, Jae-Joon -
dc.date.accessioned 2021-01-13T05:27:38Z -
dc.date.available 2021-01-13T05:27:38Z -
dc.date.created 2020-09-16 -
dc.date.issued 2020-09 -
dc.identifier.citation IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, v.39, no.9, pp.1889 - 1901 -
dc.identifier.issn 0278-0070 -
dc.identifier.uri http://hdl.handle.net/20.500.11750/12562 -
dc.description.abstract Long short-term memory (LSTM) is a widely used neural network model for time-varying data. To reduce the memory requirement, pruning is often applied to the weight matrix of the LSTM, which makes the matrix sparse. In this paper, we present a new sparse matrix format, named rearranged compressed sparse column (RCSC), to maximize the inference speed of an LSTM hardware accelerator. The RCSC format speeds up inference by: 1) evenly distributing the computation load across processing elements (PEs) and 2) reducing input vector load misses within the local buffer. We also propose a hardware architecture that adopts a hierarchical input buffer to further reduce pipeline stalls that the RCSC format alone cannot handle. Simulation results on various datasets show that the combined use of the RCSC format and the proposed hardware requires 2x less inference runtime on average than previous work. -
dc.language English -
dc.publisher Institute of Electrical and Electronics Engineers -
dc.title Balancing Computation Loads and Optimizing Input Vector Loading in LSTM Accelerators -
dc.type Article -
dc.identifier.doi 10.1109/TCAD.2019.2926482 -
dc.identifier.wosid 000562034400012 -
dc.identifier.scopusid 2-s2.0-85086447025 -
dc.type.local Article(Overseas) -
dc.type.rims ART -
dc.description.journalClass 1 -
dc.citation.publicationname IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems -
dc.contributor.nonIdAuthor Park, Junki -
dc.contributor.nonIdAuthor Yi, Wooseok -
dc.contributor.nonIdAuthor Ahn, Daehyun -
dc.contributor.nonIdAuthor Kim, Jae-Joon -
dc.identifier.citationVolume 39 -
dc.identifier.citationNumber 9 -
dc.identifier.citationStartPage 1889 -
dc.identifier.citationEndPage 1901 -
dc.identifier.citationTitle IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems -
dc.type.journalArticle Article -
dc.description.isOpenAccess N -
dc.subject.keywordAuthor Sparse matrices -
dc.subject.keywordAuthor Logic gates -
dc.subject.keywordAuthor Hardware -
dc.subject.keywordAuthor Computer architecture -
dc.subject.keywordAuthor Clocks -
dc.subject.keywordAuthor History -
dc.subject.keywordAuthor Standards -
dc.subject.keywordAuthor Accelerators -
dc.subject.keywordAuthor computer architecture -
dc.subject.keywordAuthor hardware -
dc.subject.keywordAuthor machine learning -
dc.subject.keywordAuthor recurrent neural networks (RNNs) -
dc.contributor.affiliatedAuthor Park, Junki -
dc.contributor.affiliatedAuthor Yi, Wooseok -
dc.contributor.affiliatedAuthor Ahn, Daehyun -
dc.contributor.affiliatedAuthor Kung, Jaeha -
dc.contributor.affiliatedAuthor Kim, Jae-Joon -
Files in This Item:

There are no files associated with this item.

Appears in Collections:
Department of Electrical Engineering and Computer Science Intelligent Digital Systems Lab 1. Journal Articles

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.
