Adaptive Real-Time Scheduling for Open Systems via Reinforcement Learning

Alternative Title
강화학습 기반 개방형 시스템을 위한 적응형 실시간 스케줄링
DGIST Authors
Jun Won Park; Hoon Sung Chwa; Yeseong Kim
Advisor
Hoon Sung Chwa (좌훈승)
Co-Advisor(s)
Yeseong Kim
Issued Date
2026
Awarded Date
2026-02-01
Type
Thesis
Description
Real-time scheduling, Reinforcement learning, Open systems, Feedback control, Safety shield
Abstract

Modern soft real-time systems increasingly operate as open systems, where task arrivals, terminations, and execution times vary unpredictably. Conventional model-based schedulers such as feedback control scheduling (FCS) and elastic scheduling depend on profiling and linearized models that quickly become inaccurate under dynamic workloads, resulting in degraded stability and efficiency. This thesis proposes a reinforcement learning (RL)-based adaptive real-time scheduling framework that achieves safe, scalable, and model-free control without offline profiling.

The proposed approach formulates QoS-aware scheduling as a Markov Decision Process (MDP) in which each control window defines a state, action, and reward. The RL agent selects per-task QoS levels to maximize performance while regulating utilization and deadline-miss ratio (DMR) around target references. To ensure real-time safety and responsiveness, the framework integrates three key mechanisms: (i) a predictive safety shield with an online utilization model to project unsafe actions into a reference band, (ii) an adaptive window controller with directed exploration to handle workload shifts, and (iii) a branching deep Q-network (DQN) architecture that scales linearly with the number of live tasks and QoS levels. The RL policy and predictor are trained jointly from executed transitions, enabling consistent off-policy updates and safe online learning.

Extensive simulations using periodic task sets demonstrate that the proposed RL framework reconciles the classical gain–stability trade-off in FCS. Under static workloads, the RL controller achieves rapid convergence, low utilization variance, and high in-band stability, reducing the average settling time by 70.25% and improving the band-holding rate by 15.53% compared to the best FCS configuration. Under dynamic workloads with changing task sets and utilization regimes, the RL controller adapts automatically without re-profiling, shortening average settling latency by 52.1% and achieving a perfect 100% settling success rate across all runs. Although the aggressive configuration (U∗ = 0.97) produces slightly higher transient DMR peaks due to exploration, it maintains long-term stability and delivers higher overall QoS.

Overall, this work demonstrates that reinforcement learning can safely replace model-based control in open real-time systems by learning non-linear, adaptive scheduling policies directly from interaction. Future work includes extending the framework to multi-core and distributed platforms and incorporating probabilistic safety guarantees via conformal prediction.
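To make mechanism (i) concrete, the sketch below shows one way such a projection could work. It is illustrative only, not the implementation from the thesis: the linear per-task utilization model (c_hat, the estimated utilization added per QoS step), the band endpoints u_lo and u_hi, and the greedy demote/promote rule are assumptions introduced here for exposition.

    # Illustrative safety-shield sketch (hypothetical model, not the
    # thesis's predictor or projection rule). Assumes each QoS step of
    # task i adds a strictly positive utilization c_hat[i].
    import numpy as np

    Q_MAX = 5  # assumed number of discrete QoS levels per task

    def shield(q_proposed, c_hat, u_lo, u_hi):
        """Project a proposed per-task QoS vector back into the band.

        q_proposed -- integer QoS level per live task (0..Q_MAX)
        c_hat      -- online estimate of utilization per QoS step
        u_lo, u_hi -- utilization reference band (e.g. 0.85 .. 0.97)
        """
        q = np.array(q_proposed, dtype=int)
        u = float(c_hat @ q)  # predicted utilization of the proposed action

        # Overshoot: demote the costliest demotable task step by step.
        while u > u_hi and (q > 0).any():
            i = int(np.argmax(np.where(q > 0, c_hat, -np.inf)))
            q[i] -= 1
            u -= c_hat[i]

        # Undershoot: promote the cheapest promotable task while the
        # prediction stays inside the band.
        while u < u_lo and (q < Q_MAX).any():
            i = int(np.argmin(np.where(q < Q_MAX, c_hat, np.inf)))
            if u + c_hat[i] > u_hi:
                break
            q[i] += 1
            u += c_hat[i]
        return q

    # Example: the agent proposes maximum QoS for three tasks.
    print(shield([5, 5, 5], np.array([0.10, 0.25, 0.08]), 0.85, 0.97))

Because the shield only moves the action toward the band predicted by the online model, the executed (shielded) action can be logged as the transition used for learning, which is consistent with the jointly trained policy and predictor described above.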
Key words: Real-time scheduling, Reinforcement learning, Open systems, Feedback control, Safety shield

Abstract (translated from Korean): Modern real-time systems are evolving into open systems in which tasks are created and terminated dynamically in response to external events. In such environments, existing model-based schedulers (FCS, elastic scheduling) rely on offline profiling and linearized approximation models, making it hard to maintain stability and efficiency when new tasks arrive or the load shifts abruptly. This thesis proposes a model-free adaptive real-time scheduling framework based on reinforcement learning (RL). The proposed method observes the system state (utilization and deadline-miss ratio) window by window and decides each task's QoS level as the control input. To guarantee the safety of the RL policy, it integrates (i) a prediction-based safety shield, (ii) adaptive window control and an exploration strategy, and (iii) a linearly scalable branching DQN architecture. The framework performs safe online learning without profiling; in static environments it shortens the average settling time by 70.25% and reduces variability by 18.88% relative to FCS, and in dynamic task-set experiments it maintains a 100% settling success rate and high QoS. This work shows that RL can overcome the limitations of conventional control-theoretic scheduling and enables safe, efficient control even in unpredictable open real-time environments.
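Mechanism (iii) targets the combinatorial blow-up of the joint action space: with n live tasks and |Q| QoS levels, a monolithic Q-head would need |Q|^n outputs, whereas one branch per task needs only n x |Q|. The PyTorch-style sketch below is a minimal illustration under assumed dimensions; the layer sizes, state encoding, and branch wiring are not taken from the thesis.

    # Illustrative branching Q-network: shared torso, one head per task.
    import torch
    import torch.nn as nn

    class BranchingDQN(nn.Module):
        def __init__(self, state_dim, n_tasks, n_qos_levels, hidden=128):
            super().__init__()
            self.torso = nn.Sequential(
                nn.Linear(state_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
            )
            # Output count grows as n_tasks * n_qos_levels instead of
            # n_qos_levels ** n_tasks for a joint action head.
            self.branches = nn.ModuleList(
                nn.Linear(hidden, n_qos_levels) for _ in range(n_tasks)
            )

        def forward(self, state):
            z = self.torso(state)
            # (batch, n_tasks, n_qos_levels): one row of Q-values per task
            return torch.stack([b(z) for b in self.branches], dim=1)

    # Greedy decoding: one QoS level per live task.
    net = BranchingDQN(state_dim=8, n_tasks=3, n_qos_levels=6)
    qvals = net(torch.randn(1, 8))
    action = qvals.argmax(dim=-1)  # shape (1, 3)

A real open-system deployment would also need to handle a varying number of live tasks as they arrive and terminate, which this fixed-size sketch deliberately omits.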

Table Of Contents
1 Introduction 1
2 Motivation 4
2.1 Open Systems 4
2.2 Challenges in Scheduling Open Systems 5
2.3 QoS-aware Scheduling 6
3 Background and Related Works 8
3.1 Feedback Control Real-Time Scheduling 8
3.2 Reinforcement Learning 9
4 Problem Formulation 11
4.1 Problem Formulation 12
4.2 MDP Transformation 13
4.3 Transformation to Model-free RL 13
5 Proposed Framework 15
5.1 Design Overview 15
5.2 RL Challenges 17
5.3 Safety Shield with Utilization Predictor 18
5.4 Adaptation and Window Control 18
5.5 Action Branching 20
5.6 Learning Signals 21
5.7 Structure of the Proposed Framework 23
6 Evaluation 27
6.1 Evaluation Setup 27
6.2 Static Task Set Experiments 29
6.3 Dynamic Task Set Experiments 35
7 Conclusion 41
References 43
URI
https://scholar.dgist.ac.kr/handle/20.500.11750/59708
http://dgist.dcollection.net/common/orgView/200000943048
DOI
10.22677/THESIS.200000943048
Degree
Master
Department
Department of Electrical Engineering and Computer Science
Publisher
DGIST