Detail View

RoPIM: A Processing-in-Memory Architecture for Accelerating Rotary Positional Embedding in Transformer Models

DC Field  Value
dc.contributor.author  Jeon, Yunhyeong
dc.contributor.author  Jang, Minwoo
dc.contributor.author  Lee, Hwanjun
dc.contributor.author  Jung, Yeji
dc.contributor.author  Jung, Jin
dc.contributor.author  Lee, Jonggeon
dc.contributor.author  So, Jinin
dc.contributor.author  Kim, Daehoon
dc.date.accessioned  2025-03-06T17:10:18Z
dc.date.available  2025-03-06T17:10:18Z
dc.date.created  2025-02-14
dc.date.issued  2025-01
dc.identifier.issn  1556-6056
dc.identifier.uri  http://hdl.handle.net/20.500.11750/58122
dc.description.abstract  The emergence of attention-based Transformer models, such as GPT, BERT, and LLaMA, has revolutionized Natural Language Processing (NLP) by significantly improving performance across a wide range of applications. A critical factor driving these improvements is the use of positional embeddings, which are crucial for capturing the contextual relationships between tokens in a sequence. However, current positional embedding methods face challenges, particularly in managing performance overhead for long sequences and effectively capturing relationships between adjacent tokens. In response, Rotary Positional Embedding (RoPE) has emerged as a method that effectively embeds positional information with high accuracy and without necessitating model retraining even with long sequences. Despite its effectiveness, RoPE introduces a considerable performance bottleneck during inference. We observe that RoPE accounts for 61% of GPU execution time due to extensive data movement and execution dependencies. In this paper, we introduce RoPIM, a Processing-In-Memory (PIM) architecture designed to efficiently accelerate RoPE operations in Transformer models. RoPIM achieves this by utilizing a bank-level accelerator that reduces off-chip data movement through in-accelerator support for multiply-addition operations and minimizes operational dependencies via parallel data rearrangement. Additionally, RoPIM proposes an optimized data mapping strategy that leverages both bank-level and row-level mappings to enable parallel execution, eliminate bank-to-bank communication, and reduce DRAM activations. Our experimental results show that RoPIM achieves up to a 307.9× performance improvement and 914.1× energy savings compared to conventional systems. © IEEE.
dc.language  English
dc.publisher  Institute of Electrical and Electronics Engineers
dc.title  RoPIM: A Processing-in-Memory Architecture for Accelerating Rotary Positional Embedding in Transformer Models
dc.type  Article
dc.identifier.doi  10.1109/LCA.2025.3535470
dc.identifier.wosid  001428031000001
dc.identifier.scopusid  2-s2.0-85216973093
dc.identifier.bibliographicCitation  Jeon, Yunhyeong. (2025-01). RoPIM: A Processing-in-Memory Architecture for Accelerating Rotary Positional Embedding in Transformer Models. IEEE Computer Architecture Letters, 24(1), 41–44. doi: 10.1109/LCA.2025.3535470
dc.description.isOpenAccess  FALSE
dc.subject.keywordAuthor  Processing-in-memory
dc.subject.keywordAuthor  transformer model
dc.subject.keywordAuthor  rotary positional embedding
dc.citation.endPage  44
dc.citation.number  1
dc.citation.startPage  41
dc.citation.title  IEEE Computer Architecture Letters
dc.citation.volume  24
dc.description.journalRegisteredClass  scie
dc.description.journalRegisteredClass  scopus
dc.relation.journalResearchArea  Computer Science
dc.relation.journalWebOfScienceCategory  Computer Science, Hardware & Architecture
dc.type.docType  Article
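
As background for the abstract above, the following is a minimal sketch of the RoPE operation itself, following the standard RoFormer formulation, not RoPIM's in-memory implementation. The function name apply_rope, the NumPy implementation, and the base of 10000 are illustrative assumptions rather than details taken from this article.

```python
import numpy as np

def apply_rope(x, base=10000.0):
    """Apply rotary positional embedding to x of shape (seq_len, dim).

    Each even/odd feature pair (2i, 2i+1) at position m is rotated by
    the angle m * theta_i, where theta_i = base ** (-2i / dim).
    """
    seq_len, dim = x.shape
    assert dim % 2 == 0, "feature dimension must be even"

    # Per-pair rotation frequencies theta_i.
    theta = base ** (-np.arange(0, dim, 2) / dim)           # (dim/2,)
    # Rotation angles m * theta_i for every token position m.
    angles = np.arange(seq_len)[:, None] * theta[None, :]   # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)

    # Rotate each (even, odd) coordinate pair.
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out

# Example: rotate a query matrix for a 128-token sequence with 64 features.
q = np.random.randn(128, 64).astype(np.float32)
q_rot = apply_rope(q)
```

The per-pair multiply-add and the even/odd data rearrangement visible here correspond to the operation pattern the abstract identifies as the source of RoPE's data movement and dependency overhead, which RoPIM addresses with bank-level in-memory multiply-addition and parallel data rearrangement.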

File Downloads

  • There are no files associated with this item.

