Repository Collection: null

https://scholar.dgist.ac.kr/handle/20.500.11750/93
2026-06-04T00:54:08Z

A Dual-Path Phase Interpolator with Enhanced Linearity for Clock and Data Recovery
https://scholar.dgist.ac.kr/handle/20.500.11750/59842
Author(s): Hyeonho Lee
Abstract: This paper proposes a dual-path phase interpolator with enhanced linearity that adaptively compensates for amplitude-to-phase conversion distortion, with the aim of improving the accuracy of the clock used in clock and data recovery (CDR) circuits. The proposed architecture uses two integrating-mode phase interpolators and combines their outputs to generate a signal with more accurate phase characteristics. In particular, because the output amplitude varies with the input state of the phase interpolator and this variation causes amplitude-to-phase conversion distortion, the input configuration is adjusted according to the degree of distortion, thereby optimizing the interpolator's linearity. Previous studies attempted to address amplitude-to-phase distortion by using multiple phases or simple coupling between phase interpolators, or to secure linearity with a lookup table that maps each input state to a corresponding output. However, these methods struggled to adapt to varying levels of distortion, making it difficult to consistently achieve optimal integral non-linearity (INL). A new approach was therefore needed that adjusts the phase configuration in real time based on the observed phase. The proposed method adjusts the input state of the phase interpolator using a digital control scheme, based on the measured phase difference obtained through asynchronous sampling, until a targeted phase difference is achieved.
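The adjustment loop described in the preceding sentence can be sketched roughly as follows. This is only an illustration under stated assumptions: the interpolator model (`measured_phase`, a linear ramp with a 4-degree sinusoidal bow), the code range `N_CODES`, and the one-step-at-a-time search in `settle_code` are all hypothetical and are not taken from the thesis, which obtains the phase difference by asynchronous sampling in hardware.

```python
import math

N_CODES = 32  # hypothetical number of interpolator codes across a 90-degree span


def measured_phase(code: int) -> float:
    """Hypothetical distorted interpolator: ideal ramp plus a bowed error (degrees)."""
    ideal = 90.0 * code / N_CODES
    return ideal + 4.0 * math.sin(math.pi * code / N_CODES)


def settle_code(target_deg, measure, n_codes=N_CODES, start=0):
    """Step the code one position at a time until the measured phase crosses the target."""
    code = start
    prev_code, prev_err = None, None
    for _ in range(n_codes + 1):
        err = target_deg - measure(code)
        if err == 0:
            return code
        if prev_err is not None and (err > 0) != (prev_err > 0):
            # The target lies between the last two codes: keep the closer one.
            return code if abs(err) <= abs(prev_err) else prev_code
        nxt = min(max(code + (1 if err > 0 else -1), 0), n_codes)
        if nxt == code:
            return code  # reached the end of the code range
        prev_code, prev_err = code, err
        code = nxt
    return code
```

With this assumed model, the nominal mid-scale code 16 lands at 49 degrees, 4 degrees past a 45-degree target, whereas `settle_code(45.0, measured_phase)` stops at the code whose measured phase is closest to the target.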
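This control scheme allows the interpolator to maintain optimal phase accuracy even when the integral non-linearity varies because of amplitude-to-phase distortion.
Keywords: Dual-path phase interpolator, Integrating-mode phase interpolator, AM-to-PM conversion distortion, Integral non-linearity, Asynchronous sampling
Abstract (translated from Korean): To improve the accuracy of the clock used in clock and data recovery circuits, this thesis proposes a dual-path phase interpolator with enhanced linearity that adapts to amplitude-to-phase conversion distortion. The proposed structure uses two integrating-mode phase interpolators and combines their signals to produce a signal with a more accurate phase. In particular, taking into account the amplitude-to-phase conversion distortion caused by the output amplitude varying with the input state of the phase interpolator, the composition of the input state is adjusted according to the degree of distortion, which optimizes the linearity of the phase interpolator. Previous studies tried to resolve amplitude-to-phase conversion distortion by using multiple phases or by simply coupling phase interpolators, and there were attempts to secure linearity by building a lookup table that maps each input state one-to-one to an output. However, because these methods could not respond to distortion arising from process, voltage, and temperature variations, they had difficulty securing optimal integral linearity every time. Research was therefore needed on a method that checks the phase of the phase interpolator in real time and changes the phase composition in real time. The proposed control scheme for the input state obtains the degree of phase difference through asynchronous sampling of the signals of the two integrating-mode phase interpolators. Based on this phase difference, a digital control scheme adjusts the input state until the set phase difference is reached. As a result, the interpolator retains optimal phase accuracy even when its integral non-linearity changes because of amplitude-to-phase conversion distortion.
Description: Dual-path phase interpolator, Integrating-mode phase interpolator, AM-to-PM conversion, Integral non-linearity, Asynchronous sampling
2024-12-31T15:00:00Z

Autonomous Vehicle Offloading Optimization in Congested Networks: A Large-Scale Virtual Testbed Analysis
https://scholar.dgist.ac.kr/handle/20.500.11750/59841
Author(s): Yongjae Jang
Abstract (translated from Korean): This study quantitatively analyzes, through large-scale testbed-based simulation, the performance of a dynamic load balancing scheme for efficiently managing the computation and communication load of autonomous vehicles in an RSU-cloud-based three-tier distributed architecture. Conventional local computation (Local Only) and full offloading (Offloading Only) approaches are limited in that queueing delay and energy consumption rise sharply with vehicle density and network load. To overcome this, the study applies a load balancing strategy that considers the state of computing resources and the network environment (PDR, link quality, and so on) at the same time, and verifies its effect from multiple angles.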
The simulation environment was built on a real road network (Cheongna District, Incheon), and a large-scale distributed testbed consisting of 400 vehicles (OBUs), 50 RSUs, and one cloud server was implemented. Vehicles periodically generate sensor-based application data and decide in each time slot whether to process it locally or offload it externally. In the experiments, the proposed load balancing scheme outperformed the conventional approaches across the OBUs, RSUs, and cloud in queue length stabilization, average PDR, and energy consumption; in particular, under moderate vehicle congestion, both resource utilization efficiency and communication reliability remained stable. However, when vehicle density rose sharply, traffic concentrated on specific RSUs, causing bottlenecks and PDR degradation, and a ripple effect in which congestion spread across the system was also observed. This suggests that load balancing alone cannot fully overcome physical communication resource constraints, and that a concentration of vehicles in a particular area can degrade the performance of the whole network. Through simulation-based evaluation that reflects realistic urban structure and vehicular communication conditions, the study verifies and analyzes both the effectiveness and the limitations of the proposed scheme. It thereby offers key criteria and baseline data for the design and operation of autonomous vehicle communication infrastructure, and points to possible extension toward field-validated resource control technologies and intelligent network operation strategies.
Abstract: Autonomous vehicles are evolving into intelligent platforms that must simultaneously support safety-critical control functions and process large volumes of sensor-based application data. As a result, efficient load balancing strategies that offload computational tasks to RSUs (edge servers) and cloud infrastructure are becoming increasingly essential to overcome the limitations of onboard resources and ensure real-time performance. This study implements a large-scale simulation testbed, based on the real road network of Cheongna District in Incheon, Korea, and consisting of 400 vehicles, 50 RSUs, and one cloud server, to quantitatively evaluate the performance of a dynamic load balancing scheme that distributes tasks according to both computing and network resource states. Compared to the conventional Local Only and Offloading Only approaches, the proposed method demonstrated improved performance in terms of queue stability, packet delivery ratio (PDR), and energy efficiency. In particular, under moderate traffic congestion, the system maintained a stable balance between real-time responsiveness and energy consumption.
However, when the vehicle density increased sharply, traffic became concentrated on specific RSUs, leading to communication bottlenecks, and interference spread to neighboring RSUs, producing a ripple effect of congestion. These results empirically reveal the operational limits of load balancing in high-density urban environments. This analysis highlights the need for dynamic resource management strategies that account for both scalability and reliability in the design of future autonomous vehicle communication infrastructures. It also provides a foundational reference for developing integrated control technologies that combine networking and computing under real-world constraints.
Description: Autonomous Driving, Computation Offloading, Vehicular Edge Computing (VEC), Network-Computing Resource Optimization, Simulator
2024-12-31T15:00:00Z

A Lightweight CXL Memory Emulation Framework for Modern AI Workload Exploration
https://scholar.dgist.ac.kr/handle/20.500.11750/59840
Author(s): HOYEON LEE
Abstract: Compute Express Link (CXL) has emerged as a promising solution for disaggregated memory systems, enabling both memory pooling and near-data processing (NDP) capabilities. However, the limited availability of physical CXL hardware poses significant challenges for researchers and developers seeking to evaluate its performance under real-world conditions. In this paper, we present CXLite, a lightweight emulation framework for CXL devices, designed to replicate the key characteristics of CXL systems within a NUMA-based environment. CXLite simulates the flit-level memory access, latency, and bandwidth characteristics of CXL, while also supporting NDP offloading. To demonstrate the capabilities of CXLite, we evaluate its latency behavior across various configurations using synthetic benchmarks, and assess the performance impact of CXL memory using SPEC CPU2017 workloads.
Furthermore, we explore the effectiveness of CXLite in handling modern AI workloads, including large language models (LLMs), recommender systems, and graph neural networks (GNNs). The results demonstrate that CXLite accurately replicates the behavior of CXL systems and provides valuable insights into the performance implications of CXL devices and near-data processing for memory-intensive workloads.
Keywords: Compute Express Link, Emulation, Near Data Processing
Abstract (translated from Korean): To address the limited access to real hardware for Compute Express Link (CXL), a next-generation memory interconnect technology, this thesis proposes CXLite, a lightweight emulation framework that can mimic the execution environment of high-performance AI workloads. CXL supports high-bandwidth, low-latency memory expansion and Near-Data Processing (NDP), and is particularly suited to workloads such as LLMs, recommender systems, and graph neural networks (GNNs) that require large memory capacity and exhibit irregular access patterns. However, CXL Type 2 and Type 3 devices are either not commercially available or exist only in very limited forms, and existing emulators are limited in reproducing accurate latency characteristics and in supporting NDP. Running on a NUMA-based system, CXLite dynamically injects access latency at flit granularity (256 bytes) and, through integration with PyTorch, reproduces a CXL memory system within a real AI framework environment. In addition, a message-queue-based request handling structure offloads NDP requests to a remote memory node, and performance monitoring together with latency insertion allows various CXL topologies to be emulated. In the experiments, CXLite faithfully reproduces real CXL latency characteristics and performance trends for workloads including SPEC2017, LLM, DLRM, and GNN, and shows up to a 14x latency improvement when NDP is included. The framework provides a realistic evaluation environment for CXL system design and software optimization, and can contribute significantly to future research on CXL-based memory architectures and AI applications.
Description: Compute Express Link, Emulation, AI
2024-12-31T15:00:00Z

A Voltage-Sensing-based Logic-Compatible Embedded NAND Flash Compute-In-Memory Macro for Edge Device
https://scholar.dgist.ac.kr/handle/20.500.11750/59839
Author(s): Junhyeok Lee
Abstract: Compute-in-memory (CIM) has emerged as a promising architecture to address the memory wall in conventional von Neumann systems, particularly for data-intensive applications in machine learning (ML) and edge computing.
Among various non-volatile memory (NVM) technologies, embedded flash (eFlash) is attractive for CIM due to its high density, low-power operation, and compatibility with standard CMOS processes. This thesis presents the first implementation of voltage-sensing-based multiply-and-accumulate (MAC) operations in an eFlash-based CIM macro. The proposed architecture improves area efficiency and scalability by eliminating the computation-related peripheral circuits used in prior eFlash-based CIM implementations. A logic-compatible embedded NAND (eNAND) flash structure is employed to support high-density integration and multi-level synaptic weight representation. To ensure reliable operation, location-dependent reference and pass voltages are applied, compensating for variations along the eNAND string. The sensing margin is further improved by dynamically selecting between single-ended and differential sensing modes based on input conditions. The proposed CIM macro is fabricated in TSMC 65 nm GP CMOS technology with an active area of 0.61 mm². Measurement results confirm successful MAC computation using voltage sensing. The proposed architecture offers a compact and energy-efficient solution for scalable CIM in next-generation edge devices.
Abstract (translated from Korean): Compute-in-memory is attracting attention as a promising architecture for resolving the memory bottleneck of the conventional von Neumann architecture, and is especially suited to data-intensive applications such as machine learning and edge computing. Among various non-volatile memory technologies, embedded flash is regarded as well suited to compute-in-memory because of its high density, low power consumption, and high compatibility with standard CMOS processes. This thesis presents the first implementation of voltage-sensing-based multiply-and-accumulate operations in an embedded-flash-based compute-in-memory architecture. The proposed architecture improves area efficiency and scalability by removing the peripheral circuits that earlier designs required for computation. A logic-compatible embedded NAND flash structure is adopted for high-density integration and multi-level synaptic weight representation. For reliable operation, reference and pass voltages are applied differently according to the physical location of the cell, and the sensing margin is improved by selecting single-ended or differential sensing according to input conditions. The proposed compute-in-memory macro was fabricated in a TSMC 65 nm CMOS process with an active area of 0.61 mm². Measurement results confirm correct MAC operation based on voltage sensing. This work presents a compact, energy-efficient compute-in-memory architecture for next-generation edge devices.
Description: Compute-in-memory, Non-volatile memory, Embedded Flash, Embedded NAND, MAC operation
2024-12-31T15:00:00Z
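
For readers unfamiliar with the MAC operation named in this record, the arithmetic can be sketched as below. This is a minimal sketch of the ideal computation only. It does not model the eNAND string, the location-dependent reference and pass voltages, or the single-ended and differential sensing modes. The function names `mac` and `sense`, the 2-bit weights, and the reference levels are hypothetical choices made for illustration.

```python
from bisect import bisect_right


def mac(inputs, weights):
    """Ideal multiply-and-accumulate over one column: sum of input * weight."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(x * w for x, w in zip(inputs, weights))


def sense(value, references):
    """Quantize an accumulated value by counting the sorted reference levels it meets or exceeds."""
    return bisect_right(references, value)


inputs = [1, 0, 1, 1]    # binary activations (assumed for illustration)
weights = [3, 2, 1, 0]   # multi-level weights, here 2-bit (assumed)
total = mac(inputs, weights)       # 3 + 0 + 1 + 0 = 4
code = sense(total, [2, 4, 6])     # illustrative reference levels
```

Here `sense` stands in for the comparison against reference levels that a sensing stage performs; in the macro described above that comparison is done on voltages rather than on integers.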