Abstract: Conventional point cloud registration methods in industrial environments suffer from low sensitivity to subtle geometric features, vulnerability to noise and occlusion, and limited real-time capability. To address these limitations, this paper presents a multimodal 3D point cloud registration approach that fuses image-based detection with geometric descriptors. First, an improved Real-Time Detection Transformer (RT-DETR) detects workpieces and segments regions of interest (ROIs); the corresponding point cloud regions are then downsampled with a voxel grid filter to reduce computational cost. Second, we propose a multi-scale curvature-weighted descriptor, Fast Point Feature Histogram with Multi-Scale Curvature Weighting (FPFH-MSW), and employ a bidirectional consistency search strategy to obtain a robust global coarse registration. Finally, an adaptive robust loss, the Adaptive Penalty Error Hybrid (APEH) loss, is developed for the fine registration stage, improving alignment accuracy and robustness to noise and occlusion. Experiments demonstrate that the proposed method achieves fast and accurate detection and registration of a variety of scattered and stacked workpieces, outperforming conventional algorithms in both registration accuracy and robustness. The method thus offers an efficient and reliable solution for automated part identification and high-precision localization in complex industrial scenarios.
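For orientation, two of the generic building blocks named in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names `voxel_downsample` and `mutual_nearest_matches` are hypothetical. The voxel filter is assumed to replace the points in each voxel by their centroid. The "bidirectional consistency search" is interpreted here as a mutual nearest-neighbour check in descriptor space. The descriptors are treated as arbitrary feature vectors, since FPFH-MSW itself is not specified in this section.

```python
import numpy as np


def voxel_downsample(points, voxel_size):
    """Replace all points that fall in the same voxel by their centroid."""
    points = np.asarray(points, dtype=float)
    # Integer voxel index of every point along x, y, z.
    keys = np.floor(points / voxel_size).astype(np.int64)
    # inverse[i] is the id of the occupied voxel that contains point i.
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)
    counts = np.bincount(inverse)
    centroids = np.zeros((counts.size, 3))
    # Accumulate point coordinates per voxel, then divide by the point count.
    np.add.at(centroids, inverse, points)
    return centroids / counts[:, None]


def mutual_nearest_matches(src_desc, dst_desc):
    """Keep pair (i, j) only if j is i's nearest neighbour and i is j's.

    Assumption: this mutual check stands in for the bidirectional
    consistency search; the descriptors are generic feature vectors.
    """
    src_desc = np.asarray(src_desc, dtype=float)
    dst_desc = np.asarray(dst_desc, dtype=float)
    # Pairwise Euclidean distances, shape (n_src, n_dst).
    dists = np.linalg.norm(src_desc[:, None, :] - dst_desc[None, :, :], axis=2)
    fwd = dists.argmin(axis=1)  # best target index for each source descriptor
    bwd = dists.argmin(axis=0)  # best source index for each target descriptor
    src_idx = np.arange(src_desc.shape[0])
    keep = bwd[fwd] == src_idx  # forward and backward searches agree
    return np.stack([src_idx[keep], fwd[keep]], axis=1)
```

The brute-force distance matrix keeps the sketch short; for realistic point counts a k-d tree or similar spatial index would normally replace it. The surviving index pairs would then feed a coarse transform estimator, which is outside the scope of this sketch.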