Detail View

DC Field Value Language
dc.contributor.author Martin Garaj -
dc.contributor.author Kim, Kisub -
dc.contributor.author Alexis Stockinger -
dc.date.accessioned 2026-02-10T17:10:14Z -
dc.date.available 2026-02-10T17:10:14Z -
dc.date.created 2026-02-10 -
dc.date.issued 2025-10-30 -
dc.identifier.isbn 9781643686318 -
dc.identifier.issn 0922-6389 -
dc.identifier.uri https://scholar.dgist.ac.kr/handle/20.500.11750/60039 -
dc.description.abstract Understanding the sparsity patterns of the self-attention mechanism in modern Large Language Models (LLMs) has become increasingly important for improving computational efficiency. Motivated by empirical observations, numerous algorithms assume specific sparsity structures within self-attention. In this work, we rigorously examine five common conjectures about self-attention sparsity frequently addressed in recent literature: (1) attention width decreases through network depth, (2) attention heads form distinct behavioral clusters, (3) recent tokens receive high attention, (4) the first token maintains consistent focus, and (5) semantically important tokens persistently attract attention. Our analysis uses over 4 million attention weight vectors from Llama3-8B collected over the long-context benchmark LongBench to achieve statistically significant results. Our findings strongly support the conjectures regarding recent token attention (3) and first-token focus (4). We find partial support for head clustering (2) and the Persistence of Attention Hypothesis (5), suggesting these phenomena exist but with important qualifications. Regarding attention width (1), our analysis reveals a more nuanced pattern than commonly assumed, with attention width peaking in middle layers rather than decreasing monotonically with depth. These insights suggest that effective sparse attention algorithms should preserve broader attention patterns in middle layers while allowing more targeted pruning elsewhere, offering evidence-based guidance for more efficient attention mechanism design. © 2025 The Authors. -
dc.language English -
dc.publisher IOS Press -
dc.relation.ispartof Frontiers in Artificial Intelligence and Applications -
dc.title Investigating Sparsity of Self-Attention -
dc.type Conference Paper -
dc.identifier.doi 10.3233/FAIA251302 -
dc.identifier.scopusid 2-s2.0-105024462083 -
dc.identifier.bibliographicCitation 28th European Conference on Artificial Intelligence, ECAI 2025, including 14th Conference on Prestigious Applications of Intelligent Systems, PAIS 2025, pp.4113 - 4120 -
dc.identifier.url https://ecai2025.org/detailed-schedule/ -
dc.citation.conferenceDate 2025-10-25 -
dc.citation.conferencePlace IT -
dc.citation.conferencePlace Bologna -
dc.citation.endPage 4120 -
dc.citation.startPage 4113 -
dc.citation.title 28th European Conference on Artificial Intelligence, ECAI 2025, including 14th Conference on Prestigious Applications of Intelligent Systems, PAIS 2025 -
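The abstract above describes measuring attention sparsity, including how attention width varies across layers. As a rough illustration only (not the authors' code), the sketch below loads a causal LLM with attention outputs enabled and computes one plausible width metric: the smallest number of keys whose weights cover 90% of a query's attention mass. The checkpoint name, the 90% threshold, and this particular width definition are assumptions made for the example.

```python
# Illustrative sketch: extract attention weights from a causal LM and compute a
# simple "attention width" per layer. Model name, threshold, and the width
# definition are assumptions, not necessarily those used in the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "meta-llama/Meta-Llama-3-8B"  # assumed checkpoint name

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL, attn_implementation="eager"  # eager attention so weights are returned
)
model.eval()

text = "Attention sparsity is often assumed rather than measured."
inputs = tok(text, return_tensors="pt")

with torch.no_grad():
    out = model(**inputs, output_attentions=True)
# out.attentions: one tensor per layer, shape (batch, heads, q_len, k_len)

def attention_width(weights: torch.Tensor, mass: float = 0.9) -> int:
    """Smallest number of keys whose sorted weights sum to `mass` of the total."""
    sorted_w, _ = torch.sort(weights, descending=True)
    cum = torch.cumsum(sorted_w, dim=-1)
    return int((cum < mass).sum().item()) + 1

# Width of the last query position, averaged over heads, per layer.
for layer_idx, attn in enumerate(out.attentions):
    last_row = attn[0, :, -1, :]  # (heads, k_len)
    widths = [attention_width(last_row[h]) for h in range(last_row.shape[0])]
    print(f"layer {layer_idx:2d}: mean width {sum(widths) / len(widths):.1f}")
```

Aggregating such per-layer widths over many long-context inputs is one way to check whether width shrinks with depth or, as the abstract reports, peaks in the middle layers.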

File Downloads

  • There are no files associated with this item.

Related Researcher

Kim, Kisub (김기섭)

Department of Electrical Engineering and Computer Science
