
A Dual-Precision and Low-Power CNN Inference Engine Using a Heterogeneous Processing-in-Memory Architecture
Metadata Downloads

DC Field Value Language
dc.contributor.author Jung, Sangwoo -
dc.contributor.author Lee, Jaehyun -
dc.contributor.author Park, Dahoon -
dc.contributor.author Lee, Youngjoo -
dc.contributor.author Yoon, Jong-Hyeok -
dc.contributor.author Kung, Jaeha -
dc.date.accessioned 2024-11-01T18:10:17Z -
dc.date.available 2024-11-01T18:10:17Z -
dc.date.created 2024-05-27 -
dc.date.issued 2024-12 -
dc.identifier.issn 1549-8328 -
dc.identifier.uri http://hdl.handle.net/20.500.11750/57100 -
dc.description.abstract In this article, we present an energy-scalable CNN model that can adapt to different hardware resource constraints. Specifically, we propose a dual-precision network, named DualNet, that leverages two independent bit-precision paths (INT4 and ternary-binary). DualNet achieves both high accuracy and low complexity by balancing the ratio between the two paths. We also present an evolutionary algorithm that allows the automatic search of the optimal ratios. In addition to the novel CNN architecture design, we develop a heterogeneous processing-in-memory (PIM) hardware that integrates SRAM- and eDRAM-based PIMs to efficiently compute the two precision paths in parallel. To verify the energy efficiency of DualNet computed on the heterogeneous PIM, we prototyped a test chip in 28-nm CMOS technology. To maximize the hardware efficiency, we utilize an improved data mapping scheme achieving the most effective deployment of DualNets on multiple PIM arrays. With the proposed SW-HW co-optimization, we can obtain the most energy-efficient DualNet model operating on the actual PIM hardware. Compared to other quantized networks with a single bit-precision, DualNet reduces the energy consumption, memory footprint, and latency by 29.0%, 49.5%, and 47.3% on average, respectively, for the CIFAR-10/100 and ImageNet datasets. -
dc.language English -
dc.publisher Institute of Electrical and Electronics Engineers -
dc.title A Dual-Precision and Low-Power CNN Inference Engine Using a Heterogeneous Processing-in-Memory Architecture -
dc.type Article -
dc.identifier.doi 10.1109/TCSI.2024.3395842 -
dc.identifier.wosid 001226144000001 -
dc.identifier.scopusid 2-s2.0-85193248367 -
dc.identifier.bibliographicCitation Jung, Sangwoo. (2024-12). A Dual-Precision and Low-Power CNN Inference Engine Using a Heterogeneous Processing-in-Memory Architecture. IEEE Transactions on Circuits and Systems I: Regular Papers, 71(12), 5546–5559. doi: 10.1109/TCSI.2024.3395842 -
dc.description.isOpenAccess FALSE -
dc.subject.keywordAuthor Convolutional neural networks -
dc.subject.keywordAuthor deep learning -
dc.subject.keywordAuthor Hardware -
dc.subject.keywordAuthor Memory management -
dc.subject.keywordAuthor mixed-precision quantization -
dc.subject.keywordAuthor processing-in-memory -
dc.subject.keywordAuthor Quantization (signal) -
dc.subject.keywordAuthor Random access memory -
dc.subject.keywordAuthor SW-HW co-optimization -
dc.subject.keywordAuthor Computational modeling -
dc.subject.keywordAuthor Convolution -
dc.citation.endPage 5559 -
dc.citation.number 12 -
dc.citation.startPage 5546 -
dc.citation.title IEEE Transactions on Circuits and Systems I: Regular Papers -
dc.citation.volume 71 -
dc.description.journalRegisteredClass scie -
dc.description.journalRegisteredClass scopus -
dc.relation.journalResearchArea Engineering -
dc.relation.journalWebOfScienceCategory Engineering, Electrical & Electronic -
dc.type.docType Article -

File Downloads

  • There are no files associated with this item.

Related Researcher

Yoon, Jong-Hyeok (윤종혁)

Department of Electrical Engineering and Computer Science
