Standard

Continuous Token Partitioning for Real-Time Multi-modal 3d Object Detection. / Filatov, N.; Potekhin, R.

Advances in Neural Computation, Machine Learning, and Cognitive Research VIII : Selected Papers from the XXVI International Conference on Neuroinformatics. 2025. p. 426-437 (Studies in Computational Intelligence; Vol. 1179 SCI).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review

Harvard

Filatov, N & Potekhin, R 2025, Continuous Token Partitioning for Real-Time Multi-modal 3d Object Detection. in Advances in Neural Computation, Machine Learning, and Cognitive Research VIII : Selected Papers from the XXVI International Conference on Neuroinformatics. Studies in Computational Intelligence, vol. 1179 SCI, pp. 426-437, XXVI International Conference on Neuroinformatics, Dolgoprudny, Russian Federation, 21/10/24. https://doi.org/10.1007/978-3-031-80463-2_40

APA

Filatov, N., & Potekhin, R. (2025). Continuous Token Partitioning for Real-Time Multi-modal 3d Object Detection. In Advances in Neural Computation, Machine Learning, and Cognitive Research VIII : Selected Papers from the XXVI International Conference on Neuroinformatics (pp. 426-437). (Studies in Computational Intelligence; Vol. 1179 SCI). https://doi.org/10.1007/978-3-031-80463-2_40

Vancouver

Filatov N, Potekhin R. Continuous Token Partitioning for Real-Time Multi-modal 3d Object Detection. In Advances in Neural Computation, Machine Learning, and Cognitive Research VIII : Selected Papers from the XXVI International Conference on Neuroinformatics. 2025. p. 426-437. (Studies in Computational Intelligence). https://doi.org/10.1007/978-3-031-80463-2_40

Author

Filatov, N. ; Potekhin, R. / Continuous Token Partitioning for Real-Time Multi-modal 3d Object Detection. Advances in Neural Computation, Machine Learning, and Cognitive Research VIII : Selected Papers from the XXVI International Conference on Neuroinformatics. 2025. pp. 426-437 (Studies in Computational Intelligence).

BibTeX

@inproceedings{91282cece46748159c9e9594699173b7,
title = "Continuous Token Partitioning for Real-Time Multi-modal 3d Object Detection",
abstract = "Advancements in 3D object detection are pivotal for the development of autonomous driving technologies, demanding high accuracy, robustness, and real-time processing capabilities. Current state-of-the-art multi-modal 3d object detection frameworks often struggle to balance these demands, particularly under the computational constraints of autonomous vehicles. This study introduces a novel 3D object detection framework that leverages a transformer-based fusion module, employing unique radial and zigzag partitioning techniques to efficiently integrate LiDAR and camera data. Our method, termed CTP-net, is designed to optimize inference speed while maintaining competitive detection accuracy. Tested on the NuScenes validation dataset, CTP-net achieves a NuScenes Detection Score (NDS) of 68.39. Notably, it demonstrates remarkable inference speeds of 8.50 FPS on an NVIDIA RTX 3060 and 20.72 FPS on a Tesla A100, indicating substantial improvements over existing methods making it a viable solution for deployment on edge devices with limited computational resources. {\textcopyright} 2025 Elsevier B.V., All rights reserved.",
keywords = "Autonomous driving, Multi-modal 3d object detection, Sensor Fusion, Transformers",
author = "N. Filatov and R. Potekhin",
note = "Export Date: 01 November 2025; Cited By: 0; Correspondence Address: N. Filatov; Peter the Great St. Petersburg Polytechnic University, Saint-Petersburg, Polytechnicheskaya, 29, 195251, Russian Federation; email: n.filatov@celsus.ai; Conference name: 26th International Conference on Neuroinformatics, NI 2024; Conference location: Moscow; Conference date: 21-10-2024 Through 25-10-2024",
year = "2025",
doi = "10.1007/978-3-031-80463-2_40",
language = "English",
isbn = "9783031804625",
series = "Studies in Computational Intelligence",
pages = "426--437",
booktitle = "Advances in Neural Computation, Machine Learning, and Cognitive Research VIII",

}

RIS

TY - GEN

T1 - Continuous Token Partitioning for Real-Time Multi-modal 3d Object Detection

AU - Filatov, N.

AU - Potekhin, R.

N1 - Export Date: 01 November 2025; Cited By: 0; Correspondence Address: N. Filatov; Peter the Great St. Petersburg Polytechnic University, Saint-Petersburg, Polytechnicheskaya, 29, 195251, Russian Federation; email: n.filatov@celsus.ai; Conference name: 26th International Conference on Neuroinformatics, NI 2024; Conference location: Moscow

PY - 2025

Y1 - 2025

AB - Advancements in 3D object detection are pivotal for the development of autonomous driving technologies, demanding high accuracy, robustness, and real-time processing capabilities. Current state-of-the-art multi-modal 3d object detection frameworks often struggle to balance these demands, particularly under the computational constraints of autonomous vehicles. This study introduces a novel 3D object detection framework that leverages a transformer-based fusion module, employing unique radial and zigzag partitioning techniques to efficiently integrate LiDAR and camera data. Our method, termed CTP-net, is designed to optimize inference speed while maintaining competitive detection accuracy. Tested on the NuScenes validation dataset, CTP-net achieves a NuScenes Detection Score (NDS) of 68.39. Notably, it demonstrates remarkable inference speeds of 8.50 FPS on an NVIDIA RTX 3060 and 20.72 FPS on a Tesla A100, indicating substantial improvements over existing methods making it a viable solution for deployment on edge devices with limited computational resources. © 2025 Elsevier B.V., All rights reserved.

KW - Autonomous driving

KW - Multi-modal 3d object detection

KW - Sensor Fusion

KW - Transformers

UR - https://www.mendeley.com/catalogue/1f9a3c88-ac74-3b25-835b-b71565a7ae5a/

U2 - 10.1007/978-3-031-80463-2_40

DO - 10.1007/978-3-031-80463-2_40

M3 - Article in conference proceedings

SN - 9783031804625

T3 - Studies in Computational Intelligence

SP - 426

EP - 437

BT - Advances in Neural Computation, Machine Learning, and Cognitive Research VIII

Y2 - 21 October 2024 through 25 October 2024

ER -

ID: 143217188