<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel rdf:about="https://scholar.dgist.ac.kr/handle/20.500.11750/58907">
    <title>Repository Community: null</title>
    <link>https://scholar.dgist.ac.kr/handle/20.500.11750/58907</link>
    <description />
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="https://scholar.dgist.ac.kr/handle/20.500.11750/60039" />
        <rdf:li rdf:resource="https://scholar.dgist.ac.kr/handle/20.500.11750/60013" />
      </rdf:Seq>
    </items>
    <dc:date>2026-04-21T11:49:59Z</dc:date>
  </channel>
  <item rdf:about="https://scholar.dgist.ac.kr/handle/20.500.11750/60039">
    <title>Investigating Sparsity of Self-Attention</title>
    <link>https://scholar.dgist.ac.kr/handle/20.500.11750/60039</link>
    <description>Title: Investigating Sparsity of Self-Attention
Author(s): Martin Garaj; Kim, Kisub; Alexis Stockinger
Abstract: Understanding the sparsity patterns of the self-attention mechanism in modern Large Language Models (LLMs) has become increasingly important for improving computational efficiency. Motivated by empirical observations, numerous algorithms assume specific sparsity structures within self-attention. In this work, we rigorously examine five common conjectures about self-attention sparsity frequently addressed in recent literature: (1) attention width decreases through network depth, (2) attention heads form distinct behavioral clusters, (3) recent tokens receive high attention, (4) the first token maintains consistent focus, and (5) semantically important tokens persistently attract attention. Our analysis uses over 4 million attention weight vectors from Llama3-8B, collected on the long-context benchmark LongBench, to achieve statistically significant results. Our findings strongly support the conjectures regarding recent-token attention (3) and first-token focus (4). We find partial support for head clustering (2) and the Persistence of Attention Hypothesis (5), suggesting these phenomena exist but with important qualifications. Regarding attention width (1), our analysis reveals a more nuanced pattern than commonly assumed, with attention width peaking in middle layers rather than decreasing monotonically with depth. These insights suggest that effective sparse attention algorithms should preserve broader attention patterns in middle layers while allowing more targeted pruning elsewhere, offering evidence-based guidance for more efficient attention mechanism design. © 2025 The Authors.</description>
    <dc:date>2025-10-29T15:00:00Z</dc:date>
  </item>
  <item rdf:about="https://scholar.dgist.ac.kr/handle/20.500.11750/60013">
    <title>Learning to represent code changes</title>
    <link>https://scholar.dgist.ac.kr/handle/20.500.11750/60013</link>
    <description>Title: Learning to represent code changes
Author(s): Tang, Xunzhu; Tian, Haoye; Pian, Weiguo; Ezzini, Saad; Kabore, Abdoul Kader; Habib, Andrew; Klein, Jacques; Bissyande, Tegawende F.; Kim, Kisub
Abstract: Code change representation plays a pivotal role in automating numerous software engineering tasks, such as classifying code change correctness or generating natural language summaries of code changes. Recent studies have leveraged deep learning to derive effective code change representations, primarily focusing on capturing changes in token sequences or Abstract Syntax Trees (ASTs). However, these current state-of-the-art representations do not explicitly capture the intention semantics induced by a change on the AST, nor do they effectively explore the surrounding contextual information of the modified lines. To address this, we propose Patcherizer, a new code change representation methodology. This approach explores the intention features of the context and structure, combining the context around the code change with two novel representations: one capturing the sequence intention inside the code change, and one capturing the graph intention inside the structural changes of the AST graphs before and after the code change. This comprehensive representation allows us to better capture the intentions underlying a code change. Patcherizer builds on graph convolutional neural networks for the structural representation of the intention graph and on transformers for the intention sequence representation. We assess the generalizability of Patcherizer's learned embeddings on three tasks: (1) generating code change descriptions in natural language, (2) predicting code change correctness in program repair, and (3) detecting code change intention. Experimental results show that the learned code change representation is effective for all three tasks and achieves superior performance to state-of-the-art (SOTA) approaches. For instance, on the popular task of code change description generation (a.k.a. commit message generation), Patcherizer achieves average improvements of 19.39%, 8.71%, and 34.03% in terms of BLEU, ROUGE-L, and METEOR, respectively.</description>
    <dc:date>2026-04-30T15:00:00Z</dc:date>
  </item>
</rdf:RDF>

