DGIST Scholar: Re-VoxelDet: Rethinking Neck and Head Architectures for High-Performance Voxel-based 3D Detection

Full metadata record

DC Field	Value	Language
dc.contributor.author	Lee, Jae-Keun	-
dc.contributor.author	Lee, Jin-Hee	-
dc.contributor.author	Lee, Joohyun	-
dc.contributor.author	Kwon, Soon	-
dc.contributor.author	Jung, Heechul	-
dc.date.accessioned	2024-01-22T19:10:11Z	-
dc.date.available	2024-01-22T19:10:11Z	-
dc.date.created	2024-01-02	-
dc.date.issued	2024-01-06	-
dc.identifier.issn	2642-9381	-
dc.identifier.uri	http://hdl.handle.net/20.500.11750/47643	-
dc.description.abstract	Currently, widely employed LiDAR-based 3D object detectors adopt grid-based approaches to efficiently handle sparse point clouds. However, during this process, the down-sampled features inevitably lose spatial information, which can hinder the detectors from accurately predicting the location and size of objects. To address this issue, previous research has proposed elaborately designed neck and head modules to compensate for the information loss. Inspired by the core insights of previous studies, we propose a novel voxel-based 3D object detector, named Re-VoxelDet, which combines three distinct components to achieve both good detection capability and real-time performance. First, in order to learn features from diverse perspectives without additional computational costs during inference, we introduce the Multi-view Voxel Backbone (MVBackbone). Second, to effectively compensate for abundant spatial and strong semantic information, we design the Hierarchical Voxel-guided Auxiliary Neck (HVANeck), which attentively integrates hierarchically generated voxel-wise features with RPN blocks. Third, we present the Rotation-based Group Head (RGHead), a simple yet effective head module that is designed with two groups according to the heading direction and aspect ratio of the objects. Through extensive experiments on the Argoverse2, nuScenes, and Waymo Open Dataset benchmarks, we demonstrate the effectiveness of our approach. Our results significantly outperform existing state-of-the-art methods. We plan to release our model and code in the near future.	-
dc.language	English	-
dc.publisher	Computer Vision Foundation, IEEE Computer Society	-
dc.title	Re-VoxelDet: Rethinking Neck and Head Architectures for High-Performance Voxel-based 3D Detection	-
dc.type	Conference Paper	-
dc.identifier.bibliographicCitation	IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2024), pp.7503 - 7512	-
dc.identifier.url	https://openaccess.thecvf.com/content/WACV2024/html/Lee_Re-VoxelDet_Rethinking_Neck_and_Head_Architectures_for_High-Performance_Voxel-Based_3D_WACV_2024_paper.html	-
dc.citation.conferencePlace	US	-
dc.citation.conferencePlace	Hawaii	-
dc.citation.endPage	7512	-
dc.citation.startPage	7503	-
dc.citation.title	IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2024)	-

Files in This Item: There are no files associated with this item.

Appears in Collections: Division of Automotive Technology 2. Conference Papers


