Detail View

DC Field Value Language
dc.contributor.author Hwang, Kyumin -
dc.contributor.author Choi, Wonhyeok -
dc.contributor.author Han, Kiljoon -
dc.contributor.author Choi, Wonjoon -
dc.contributor.author Choi, Minwoo -
dc.contributor.author Na, Yongcheon -
dc.contributor.author Park, Minwoo -
dc.contributor.author Im, Sunghoon -
dc.date.accessioned 2026-02-10T00:40:16Z -
dc.date.available 2026-02-10T00:40:16Z -
dc.date.created 2025-12-04 -
dc.date.issued 2026-01 -
dc.identifier.uri https://scholar.dgist.ac.kr/handle/20.500.11750/60002 -
dc.description.abstract Recent foundation models demonstrate strong generalization capabilities in monocular depth estimation. However, directly applying these models to Full Surround Monocular Depth Estimation (FSMDE) presents two major challenges: (1) high computational cost, which limits real-time performance, and (2) difficulty in estimating metric-scale depth, as these models are typically trained to predict only relative depth. To address these limitations, we propose a novel knowledge distillation strategy that transfers robust depth knowledge from a foundation model to a lightweight FSMDE network. Our approach leverages a hybrid regression framework that combines a knowledge distillation scheme (traditionally used in classification) with a depth binning module to enhance scale consistency. Specifically, we introduce a cross-interaction knowledge distillation scheme that distills the scale-invariant depth bin probabilities of a foundation model into the student network while guiding it to infer metric-scale depth bin centers from ground-truth depth. Furthermore, we propose view-relational knowledge distillation, which encodes structural relationships among adjacent camera views and transfers them to enhance cross-view depth consistency. Experiments on DDAD and nuScenes demonstrate the effectiveness of our method compared to conventional supervised methods and existing knowledge distillation approaches. Moreover, our method achieves a favorable trade-off between performance and efficiency, meeting real-time requirements. © 2016 IEEE. -
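The abstract's hybrid scheme (distilling scale-invariant bin probabilities while regressing metric-scale bin centers, plus a view-relational term) can be illustrated with a minimal NumPy sketch. This is an assumption-heavy illustration, not the paper's implementation: the function names, the KL-plus-L1 combination, and the cosine-similarity view relation are all hypothetical stand-ins for the losses the abstract only names.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over depth bins.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def distill_depth_bins(teacher_logits, student_logits, student_bin_centers, gt_depth):
    """Hypothetical hybrid loss in the spirit of the abstract:
    - KL divergence aligns the student's scale-invariant bin probabilities
      with the foundation-model teacher's.
    - An L1 term supervises the metric-scale depth decoded from the
      student's bin centers against ground truth."""
    p_t = softmax(teacher_logits)
    p_s = softmax(student_logits)
    kl = np.sum(p_t * (np.log(p_t + 1e-8) - np.log(p_s + 1e-8)), axis=-1).mean()
    # Expected depth per pixel: probability-weighted sum of bin centers.
    pred_depth = np.sum(p_s * student_bin_centers, axis=-1)
    l1 = np.abs(pred_depth - gt_depth).mean()
    return kl + l1

def view_relation_loss(teacher_feats, student_feats):
    """Hypothetical view-relational term: distill the pairwise cosine
    similarities among per-camera-view feature vectors, so the student
    mirrors the teacher's cross-view structure."""
    def rel(f):
        f = f / np.linalg.norm(f, axis=-1, keepdims=True)
        return f @ f.T
    return np.abs(rel(teacher_feats) - rel(student_feats)).mean()
```

With identical teacher and student logits and ground truth equal to the decoded depth, both terms vanish, which is the sanity check one would expect of such a distillation objective.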
dc.language English -
dc.publisher Institute of Electrical and Electronics Engineers -
dc.title Scale-Invariant and View-Relational Representation Learning for Full Surround Monocular Depth -
dc.type Article -
dc.identifier.doi 10.1109/LRA.2025.3635451 -
dc.identifier.wosid 001631846400001 -
dc.identifier.scopusid 2-s2.0-105022656899 -
dc.identifier.bibliographicCitation IEEE Robotics and Automation Letters, v.11, no.1, pp.1002 - 1009 -
dc.description.isOpenAccess FALSE -
dc.subject.keywordAuthor Full surround depth -
dc.subject.keywordAuthor knowledge distillation -
dc.subject.keywordAuthor lightweight -
dc.subject.keywordAuthor monocular depth -
dc.subject.keywordAuthor representation learning -
dc.citation.endPage 1009 -
dc.citation.number 1 -
dc.citation.startPage 1002 -
dc.citation.title IEEE Robotics and Automation Letters -
dc.citation.volume 11 -
dc.description.journalRegisteredClass scie -
dc.description.journalRegisteredClass scopus -
dc.relation.journalResearchArea Robotics -
dc.relation.journalWebOfScienceCategory Robotics -
dc.type.docType Article -



Related Researcher

Im, Sunghoon (임성훈)

Department of Electrical Engineering and Computer Science
