• N. Filatov
  • R. Potekhin
Advancements in 3D object detection are pivotal for the development of autonomous driving technologies, demanding high accuracy, robustness, and real-time processing capabilities. Current state-of-the-art multi-modal 3D object detection frameworks often struggle to balance these demands, particularly under the computational constraints of autonomous vehicles. This study introduces a novel 3D object detection framework that leverages a transformer-based fusion module, employing unique radial and zigzag partitioning techniques to efficiently integrate LiDAR and camera data. Our method, termed CTP-net, is designed to optimize inference speed while maintaining competitive detection accuracy. Tested on the NuScenes validation dataset, CTP-net achieves a NuScenes Detection Score (NDS) of 68.39. Notably, it demonstrates remarkable inference speeds of 8.50 FPS on an NVIDIA RTX 3060 and 20.72 FPS on a Tesla A100, a substantial improvement over existing methods that makes it a viable solution for deployment on edge devices with limited computational resources. © 2025 Elsevier B.V. All rights reserved.
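The radial and zigzag partitioning named in the abstract can be illustrated with a minimal sketch. The function names, the sector count, and the serpentine grid traversal below are illustrative assumptions for generic BEV (bird's-eye-view) partitioning, not the paper's actual CTP-net implementation:

```python
import numpy as np

def radial_partition(points, num_sectors=8):
    """Assign each BEV point to an angular sector around the ego origin.
    Hypothetical illustration of radial partitioning (not CTP-net's scheme).
    points: (N, 2) array of x, y coordinates."""
    angles = np.arctan2(points[:, 1], points[:, 0])        # in [-pi, pi]
    bins = ((angles + np.pi) / (2 * np.pi) * num_sectors).astype(int)
    return np.clip(bins, 0, num_sectors - 1)

def zigzag_order(rows, cols):
    """Serpentine (zigzag) traversal order over a rows x cols BEV grid:
    left-to-right on even rows, right-to-left on odd rows, so that
    consecutive tokens in a transformer sequence stay spatially adjacent."""
    order = []
    for r in range(rows):
        cs = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        order.extend(r * cols + c for c in cs)
    return order
```

In a transformer-based fusion module, partition indices like these could group LiDAR and camera features into local windows so that attention is computed per partition rather than globally, which is one common way such schemes reduce inference cost.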
Original language: English
Title of host publication: Advances in Neural Computation, Machine Learning, and Cognitive Research VIII
Subtitle of host publication: Selected Papers from the XXVI International Conference on Neuroinformatics
Pages: 426-437
Number of pages: 12
DOIs
State: Published - 2025
Event: XXVI International Conference on Neuroinformatics - Dolgoprudny, Russian Federation
Duration: 21 Oct 2024 - 25 Oct 2024

Publication series

Name: Studies in Computational Intelligence
Volume: 1179 SCI

Conference

Conference: XXVI International Conference on Neuroinformatics
Country/Territory: Russian Federation
City: Dolgoprudny
Period: 21/10/24 - 25/10/24

Research areas

  • Autonomous driving, Multi-modal 3D object detection, Sensor Fusion, Transformers

ID: 143217188