PolarFusion: A multi-modal fusion algorithm for 3D object detection based on polar coordinates

Neural Netw. 2025 Jun 18:190:107704. doi: 10.1016/j.neunet.2025.107704. Online ahead of print.

Abstract

Existing 3D object detection algorithms that fuse multi-modal sensor information typically operate in Cartesian coordinates, which can lead to asymmetrical feature information and uneven attention across multiple views. To address this, we propose PolarFusion, the first multi-modal fusion bird's-eye-view (BEV) object detection algorithm based on polar coordinates. The approach comprises three specialized modules: the Polar Region Candidates Generation Module, the Polar Region Query Generation Module, and the Polar Region Information Fusion Module. The Polar Region Candidates Generation Module uses a region proposal-based segmentation method to remove irrelevant areas from images, which improves PolarFusion's processing efficiency. The segmented image regions are then integrated into the point cloud segmentation task, which addresses feature misalignment during fusion. The Polar Region Query Generation Module leverages prior information to generate high-quality target queries, reducing the time spent learning from initialization. The Polar Region Information Fusion Module employs a simple yet efficient self-attention mechanism to merge internal information from images and point clouds. This captures long-range dependencies in image texture information while preserving the precise positional data from point clouds, enabling more accurate BEV object detection. We conducted extensive experiments on challenging BEV object detection datasets. PolarFusion achieves a nuScenes detection score (NDS) of 76.1% and a mean average precision (mAP) of 74.5% on the nuScenes test set, significantly outperforming Cartesian-based methods, and qualitative results are consistent with these numbers. This advancement enhances the environmental perception capabilities of autonomous vehicles and contributes to the development of future intelligent transportation systems. The code will be released at https://github.com/RunshuaiGe/PolarFusion.git.
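
To make the polar-coordinate BEV representation concrete, the following is a minimal sketch of how LiDAR points could be binned into a polar grid indexed by range and azimuth instead of a Cartesian x-y grid. It is not taken from the PolarFusion code: the function name `polar_bev_occupancy`, its parameters, and the grid sizes are hypothetical and chosen only for illustration.

```python
import numpy as np


def polar_bev_occupancy(points, num_radial=5, num_angular=4, max_range=50.0):
    """Count points per polar BEV cell (hypothetical helper, illustration only).

    points: array of shape (N, 3) holding x, y, z in the ego frame.
    Returns an integer grid of shape (num_radial, num_angular).
    """
    x, y = points[:, 0], points[:, 1]
    r = np.hypot(x, y)            # radial distance from the ego vehicle
    theta = np.arctan2(y, x)      # azimuth in [-pi, pi]

    keep = r < max_range          # drop points beyond the sensing range
    r_idx = (r[keep] / max_range * num_radial).astype(int)
    a_idx = ((theta[keep] + np.pi) / (2 * np.pi) * num_angular).astype(int)

    # Guard against floating-point results landing exactly on the upper edge.
    r_idx = np.minimum(r_idx, num_radial - 1)
    a_idx = a_idx % num_angular

    grid = np.zeros((num_radial, num_angular), dtype=np.int64)
    np.add.at(grid, (r_idx, a_idx), 1)  # unbuffered add handles repeated cells
    return grid


# Four nearby points, one per quadrant, plus one far point and one out of range.
example_points = np.array([
    [1.0, 1.0, 0.0],
    [-1.0, 1.0, 0.0],
    [-1.0, -1.0, 0.0],
    [1.0, -1.0, 0.0],
    [30.0, 30.0, 0.0],
    [100.0, 100.0, 0.0],
])
example_grid = polar_bev_occupancy(example_points)
```

In such a grid every angular sector has the same shape relative to the sensor, which is one way to picture the symmetry across views that motivates a polar formulation. Cells near the ego vehicle cover less area than distant ones, so a real detector would typically operate on learned features per cell rather than raw counts.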

Keywords: 3D object detection; Automotive driving; BEV; Environmental perception; Polar coordinates.