Fusion is Not Enough: Single Modal Attacks on Fusion Models for 3D Object Detection
DC Field Value Language
dc.contributor.author Cheng, Zhiyuan -
dc.contributor.author Choi, Hongjun -
dc.contributor.author Feng, Shiwei -
dc.contributor.author Liang, James -
dc.contributor.author Tao, Guanhong -
dc.contributor.author Liu, Dongfang -
dc.contributor.author Zuzak, Michael -
dc.contributor.author Zhang, Xiangyu -
dc.date.accessioned 2025-02-03T21:40:15Z -
dc.date.available 2025-02-03T21:40:15Z -
dc.date.created 2024-08-16 -
dc.date.issued 2024-05-07 -
dc.identifier.uri http://hdl.handle.net/20.500.11750/57852 -
dc.description.abstract Multi-sensor fusion (MSF) is widely used in autonomous vehicles (AVs) for perception, particularly for 3D object detection with camera and LiDAR sensors. The purpose of fusion is to capitalize on the advantages of each modality while minimizing their weaknesses. Advanced deep neural network (DNN)-based fusion techniques have demonstrated exceptional, industry-leading performance. Due to the redundant information in multiple modalities, MSF is also recognized as a general defense strategy against adversarial attacks. In this paper, we attack fusion models from the camera modality, which is considered to be of lesser importance in fusion but is more affordable for attackers. We argue that the weakest link of fusion models depends on their most vulnerable modality, and we propose an attack framework that targets advanced camera-LiDAR fusion-based 3D object detection models through camera-only adversarial attacks. Our approach employs a two-stage optimization-based strategy that first thoroughly evaluates vulnerable image areas under adversarial attacks, and then applies dedicated attack strategies to different fusion models to generate deployable patches. Evaluations with six advanced camera-LiDAR fusion models and one camera-only model indicate that our attacks successfully compromise all of them. Our approach can either decrease the mean average precision (mAP) of detection performance from 0.824 to 0.353, or degrade the detection score of a target object from 0.728 to 0.156, demonstrating the efficacy of our proposed attack framework. Code is available. © 2024 12th International Conference on Learning Representations, ICLR 2024. All rights reserved. -
dc.language English -
dc.publisher International Conference on Learning Representations, ICLR -
dc.relation.ispartof 12th International Conference on Learning Representations, ICLR 2024 -
dc.title Fusion is Not Enough: Single Modal Attacks on Fusion Models for 3D Object Detection -
dc.type Conference Paper -
dc.identifier.doi 10.48550/arXiv.2304.14614 -
dc.identifier.scopusid 2-s2.0-85200605885 -
dc.identifier.bibliographicCitation Cheng, Zhiyuan. (2024-05-07). Fusion is Not Enough: Single Modal Attacks on Fusion Models for 3D Object Detection. International Conference on Learning Representations (poster), 1–25. doi: 10.48550/arXiv.2304.14614 -
dc.identifier.url https://iclr.cc/virtual/2024/poster/19509 -
dc.citation.conferenceDate 2024-05-07 -
dc.citation.conferencePlace AT -
dc.citation.conferencePlace Vienna -
dc.citation.endPage 25 -
dc.citation.startPage 1 -
dc.citation.title International Conference on Learning Representations (poster) -

File Downloads

  • There are no files associated with this item.
