<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
  <channel>
    <title>Repository Collection: null</title>
    <link>https://scholar.dgist.ac.kr/handle/20.500.11750/12963</link>
    <description />
    <pubDate>Sat, 04 Apr 2026 17:06:33 GMT</pubDate>
    <dc:date>2026-04-04T17:06:33Z</dc:date>
    <item>
      <title>DeepPM: Predicting Performance and Energy Consumption of Program Binaries Using Transformers</title>
      <link>https://scholar.dgist.ac.kr/handle/20.500.11750/60010</link>
      <description>Title: DeepPM: Predicting Performance and Energy Consumption of Program Binaries Using Transformers
Author(s): Shim, Jun S.; Chang, Hyeonji; Kim, Yeseong; Kim, Jihong
Abstract: Accurate estimation of performance and energy consumption is critical for optimizing application efficiency on diverse hardware platforms. Traditional methods often rely on profiling and measurements, requiring at least one execution, making them time-consuming and resource-intensive. This article introduces the Deep Power Meter (DeepPM) framework, leveraging deep learning, specifically the Transformer architecture, to predict the performance and energy consumption of basic blocks directly from compiled binaries, eliminating the need for explicit measurement processes. The DeepPM model effectively learns the performance and energy consumption of basic blocks, enabling accurate predictions for each. Furthermore, the framework enhances applicability across different ISAs and microarchitectures, addressing limitations of state-of-the-art ML-based techniques restricted to specific processor architectures. Experimental results using the SPEC CPU 2017 benchmark suite show that DeepPM achieves significantly lower prediction errors than state-of-the-art ML-based techniques, with a 24% improvement for performance prediction and an 18% improvement for energy-consumption prediction on x86 basic blocks, and similar gains for ARM processors. Fine-tuning with minimal data from the Phoronix Test Suite further validates DeepPM’s robustness, achieving an error of approximately 13.7%, close to the fully trained model’s 13.3% error. These findings demonstrate DeepPM’s ability to enhance the accuracy and efficiency of performance and energy consumption predictions, making it a valuable tool for optimizing computing systems across diverse hardware environments. © 2025 Elsevier B.V. All rights reserved.</description>
      <pubDate>Fri, 31 Oct 2025 15:00:00 GMT</pubDate>
      <guid isPermaLink="false">https://scholar.dgist.ac.kr/handle/20.500.11750/60010</guid>
      <dc:date>2025-10-31T15:00:00Z</dc:date>
    </item>
    <item>
      <title>Diffusion-Based Generative System Surrogates for Scalable Learning-Driven Optimization in Virtual Playgrounds</title>
      <link>https://scholar.dgist.ac.kr/handle/20.500.11750/58506</link>
      <description>Title: Diffusion-Based Generative System Surrogates for Scalable Learning-Driven Optimization in Virtual Playgrounds
Author(s): Lee, Junyoung; Kim, Seohyun; Jang, Shinhyoung; Park, Jongho; Kim, Yeseong
Abstract: In this paper, we introduce DiffNEST, a diffusion-based surrogate framework for scalable, learning-driven optimization in complex computing environments. The growing complexity of modern systems often renders traditional optimization techniques inefficient, while reinforcement learning (RL)-based methods struggle with high data collection costs and hardware constraints. DiffNEST employs a diffusion model to generate realistic, continuous system traces that reflect diverse workload characteristics, enabling optimization without reliance on physical hardware and facilitating rapid exploration of large optimization search spaces. A case study demonstrates that DiffNEST can accelerate real-world optimization tasks, achieving up to a 50% improvement in task-aware adaptive DVFS and 16% in multi-core cache allocation compared to RL approaches trained directly on physical hardware. Through fine-tuning, we show that DiffNEST can also be reused across multiple optimization tasks and workload domains, indicating its potential as a general-purpose surrogate modeling framework for system-level optimization. The code is publicly available to facilitate further research and development. © 2025 Copyright held by the owner/author(s).</description>
      <pubDate>Sat, 31 May 2025 15:00:00 GMT</pubDate>
      <guid isPermaLink="false">https://scholar.dgist.ac.kr/handle/20.500.11750/58506</guid>
      <dc:date>2025-05-31T15:00:00Z</dc:date>
    </item>
    <item>
      <title>Advancing Hyperdimensional Computing Based on Trainable Encoding and Adaptive Training for Efficient and Accurate Learning</title>
      <link>https://scholar.dgist.ac.kr/handle/20.500.11750/57405</link>
      <description>Title: Advancing Hyperdimensional Computing Based on Trainable Encoding and Adaptive Training for Efficient and Accurate Learning
Author(s): Kim, Jiseung; Lee, Hyunsei; Imani, Mohsen; Kim, Yeseong
Abstract: Hyperdimensional computing (HDC) is a computing paradigm inspired by the mechanisms of human memory, characterizing data through high-dimensional vector representations known as hypervectors. Recent advancements in HDC have explored its potential as a learning model, leveraging its straightforward arithmetic and high efficiency. Traditional HDC frameworks are hampered by two primary static elements: randomly generated encoders and fixed learning rates. These static components significantly limit model adaptability and accuracy. The static, randomly generated encoders, while ensuring high-dimensional representation, fail to adapt to evolving data relationships, thereby constraining the model&apos;s ability to accurately capture and learn from complex patterns. Similarly, the fixed nature of the learning rate does not account for the varying needs of the training process over time, hindering efficient convergence and optimal performance. This article introduces TrainableHD, a novel HDC framework that enables dynamic training of the randomly generated encoder based on feedback from the learning data, thereby addressing the static nature of conventional HDC encoders. TrainableHD also enhances training performance by incorporating adaptive optimizer algorithms in learning the hypervectors. We further refine TrainableHD with effective quantization to enhance efficiency, allowing execution of the inference phase on low-precision accelerators. Our evaluations demonstrate that TrainableHD significantly improves HDC accuracy by up to 27.99% (averaging 7.02%) without additional computational costs during inference, achieving a performance level comparable to state-of-the-art deep learning models. Furthermore, TrainableHD is optimized for execution speed and energy efficiency: compared to deep learning on a low-power GPU platform like NVIDIA Jetson Xavier, TrainableHD is 56.4 times faster and 73 times more energy efficient. This efficiency is further augmented through the use of Encoder Interval Training (EIT) and adaptive optimizer algorithms, enhancing the training process without compromising the model&apos;s accuracy. Copyright © 2024 held by the owner/author(s).</description>
      <pubDate>Sat, 31 Aug 2024 15:00:00 GMT</pubDate>
      <guid isPermaLink="false">https://scholar.dgist.ac.kr/handle/20.500.11750/57405</guid>
      <dc:date>2024-08-31T15:00:00Z</dc:date>
    </item>
    <item>
      <title>OpenHD: A GPU-Powered Framework for Hyperdimensional Computing</title>
      <link>https://scholar.dgist.ac.kr/handle/20.500.11750/17283</link>
      <description>Title: OpenHD: A GPU-Powered Framework for Hyperdimensional Computing
Author(s): Kang, Jaeyoung; Khaleghi, Behnam; Rosing, Tajana; Kim, Yeseong
Abstract: Hyperdimensional computing (HDC) has emerged as a lightweight learning alternative to deep neural networks. A key characteristic of HDC is the great extent of parallelism it offers, which can facilitate hardware acceleration. However, previous hardware implementations of HDC have seldom targeted GPUs, and those that did were inefficient, partly due to the complexity of accelerating HDC on GPUs. In this paper, we present OpenHD, a flexible and high-performance GPU-powered framework that automates the mapping of general HDC applications, including classification and clustering, to GPUs. OpenHD takes advantage of memory optimization strategies specialized for HDC, minimizing access time to the different memory subsystems and removing redundant operations. We also propose a novel training method that enables data parallelism in HDC training. Our evaluation shows that the proposed training rapidly achieves the target accuracy, reducing the required training epochs by 4x. With OpenHD, users can deploy GPU-accelerated HDC applications without domain expert knowledge. Compared to the state-of-the-art GPU-powered HDC implementation, our evaluation on NVIDIA Jetson TX2 shows that OpenHD is up to 10.5x and 314x faster for HDC-based classification and clustering, respectively. Compared with non-HDC classification and clustering on GPUs, HDC powered by OpenHD is 11.7x and 53x faster at comparable accuracy. © IEEE</description>
      <pubDate>Mon, 31 Oct 2022 15:00:00 GMT</pubDate>
      <guid isPermaLink="false">https://scholar.dgist.ac.kr/handle/20.500.11750/17283</guid>
      <dc:date>2022-10-31T15:00:00Z</dc:date>
    </item>
  </channel>
</rss>