
Cutting-Edge Inference: Dynamic DNN Model Partitioning and Resource Scaling for Mobile AI

Metadata

DC Field Value Language
dc.contributor.author Lim, Jeong-A -
dc.contributor.author Lee, Joohyun -
dc.contributor.author Kwak, Jeongho -
dc.contributor.author Kim, Yeongjin -
dc.date.accessioned 2024-12-23T22:10:16Z -
dc.date.available 2024-12-23T22:10:16Z -
dc.date.created 2024-10-10 -
dc.date.issued 2024-11 -
dc.identifier.issn 1939-1374 -
dc.identifier.uri http://hdl.handle.net/20.500.11750/57402 -
dc.description.abstract Recently, applications using artificial intelligence (AI) techniques on mobile devices, such as augmented reality, have become pervasive. The hardware specifications of mobile devices, dynamic service demands, stochastic network states, and the characteristics of deep neural network (DNN) models all affect the quality of experience (QoE) of such applications. In this paper, we propose CutEdge, which leverages a virtual queue-based Lyapunov optimization framework to jointly optimize DNN model partitioning between a mobile device and a mobile edge computing (MEC) server and the processing/networking resources of the mobile device with respect to internal/external system dynamics. Specifically, CutEdge simultaneously decides (i) the partition point of the DNN model between the mobile device and the MEC server, (ii) the GPU clock frequency, and (iii) the transmission rate of the mobile device. We then theoretically derive the optimal trade-off curves among energy consumption, throughput, and end-to-end latency yielded by CutEdge, QoE metrics that have not been jointly addressed in previous studies. Moreover, we show the impact of jointly optimizing the three control parameters on performance via real trace-driven simulations. Finally, we demonstrate the superiority of CutEdge over existing algorithms through experiments on a testbed implemented with an embedded AI device and an MEC server. © IEEE. -
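The joint control loop sketched in the abstract follows the standard Lyapunov drift-plus-penalty pattern: in each time slot, choose the (partition point, GPU clock, transmission rate) triple minimizing a weighted sum of an energy penalty and a virtual-queue-weighted latency term, then update the virtual queue. The snippet below is a minimal illustration of that pattern only; the action grids, energy/latency models, and constants are hypothetical placeholders, not CutEdge's actual formulation.

```python
import itertools

# Hypothetical candidate action sets (illustrative, not from the paper).
PARTITION_POINTS = range(0, 5)        # DNN layer index where execution moves to the MEC server
GPU_FREQS_GHZ = [0.5, 1.0, 1.5]       # mobile-device GPU clock frequencies
TX_RATES_MBPS = [10.0, 50.0, 100.0]   # uplink transmission rates

V = 10.0  # Lyapunov trade-off parameter: larger V weights the penalty (energy) more


def energy_cost(p, f, r):
    """Toy energy model: local compute energy grows with depth and frequency^2."""
    return p * f ** 2 + 0.01 * r


def latency(p, f, r, total_layers=5, upload_mb=4.0):
    """Toy end-to-end latency: local layers + activation upload + remote layers."""
    local = p / f                       # p layers processed locally at clock f
    remote = (total_layers - p) / 2.0   # fixed assumed server speed
    comm = upload_mb * 8 / r            # cut-layer activation upload time
    return local + comm + remote


def drift_plus_penalty_step(queue, arrival):
    """One slot: pick the action minimizing V*energy + queue*latency,
    then apply the virtual-queue update max(Q + arrival - service, 0)."""
    best = min(
        itertools.product(PARTITION_POINTS, GPU_FREQS_GHZ, TX_RATES_MBPS),
        key=lambda a: V * energy_cost(*a) + queue * latency(*a),
    )
    new_queue = max(queue + arrival - latency(*best), 0.0)
    return best, new_queue


q = 0.0
for slot in range(3):
    action, q = drift_plus_penalty_step(q, arrival=2.0)
    print(f"slot {slot}: action={action}, queue={q:.2f}")
```

As the virtual queue grows, the latency term dominates and the minimizer shifts toward faster (more energy-hungry) actions; `V` tunes that energy/latency trade-off, mirroring the trade-off curves the paper analyzes.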
dc.language English -
dc.publisher Institute of Electrical and Electronics Engineers -
dc.title Cutting-Edge Inference: Dynamic DNN Model Partitioning and Resource Scaling for Mobile AI -
dc.type Article -
dc.identifier.doi 10.1109/TSC.2024.3466848 -
dc.identifier.wosid 001386516500010 -
dc.identifier.scopusid 2-s2.0-85205146701 -
dc.identifier.bibliographicCitation Lim, Jeong-A, Lee, Joohyun, Kwak, Jeongho, & Kim, Yeongjin. (2024-11). Cutting-Edge Inference: Dynamic DNN Model Partitioning and Resource Scaling for Mobile AI. IEEE Transactions on Services Computing, 17(6), 3300–3316. doi: 10.1109/TSC.2024.3466848 -
dc.description.isOpenAccess FALSE -
dc.subject.keywordAuthor mobile edge computing -
dc.subject.keywordAuthor mobile vision application -
dc.subject.keywordAuthor quality of experience -
dc.subject.keywordAuthor DNN model partitioning -
dc.subject.keywordAuthor deep learning -
dc.citation.endPage 3316 -
dc.citation.number 6 -
dc.citation.startPage 3300 -
dc.citation.title IEEE Transactions on Services Computing -
dc.citation.volume 17 -
dc.description.journalRegisteredClass scie -
dc.description.journalRegisteredClass scopus -
dc.relation.journalResearchArea Computer Science -
dc.relation.journalWebOfScienceCategory Computer Science, Information Systems; Computer Science, Software Engineering -
dc.type.docType Article -

File Downloads

  • There are no files associated with this item.


Related Researcher

Kwak, Jeongho (곽정호)

Department of Electrical Engineering and Computer Science
