Abstract: To address the low detection accuracy caused by small target size, blurred edges, and complex background interference in remote sensing images, this paper proposes an improved small object detection algorithm for remote sensing images based on RT-DETR (real-time detection transformer). First, an encoder with a spatially-enhanced feedforward network (SEFN) is designed to establish spatial-semantic associations, strengthening the response of target regions and their edges. Second, a neck network built on a hypergraph modulation fusion module (HyperMFM) is designed: hypergraph computation models global contextual relationships, and modulated feature fusion provides efficient, adaptive fusion of features. Finally, a Focaler-ShapeIoU loss function is constructed to focus on small target samples and their intrinsic shape characteristics. Experimental results on the SkyFusion dataset show that the improved model raises mAP0.5 by 7.29% and mAP0.5:0.95 by 5.52% over the baseline model, effectively enhancing the detection accuracy of small targets in remote sensing images.
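
As a rough illustration of the "Focaler" part of the loss named above, the sketch below shows the linear interval mapping commonly associated with Focaler-IoU and how it might be combined with a base IoU-style loss. This is a sketch under assumptions, not the paper's implementation: the function names `focaler_iou` and `focaler_loss`, the thresholds `d` and `u`, and the additive combination are illustrative, and the Shape-IoU term itself is not reproduced here (it is passed in as an already-computed `base_loss`).

```python
def focaler_iou(iou: float, d: float, u: float) -> float:
    """Map IoU linearly from the interval [d, u] onto [0, 1].

    Values below d map to 0 and values above u map to 1, so the
    gradient concentrates on samples whose IoU lies inside [d, u].
    d and u are hypothetical hyperparameters chosen by the user.
    """
    if not 0.0 <= d < u <= 1.0:
        raise ValueError("expected 0 <= d < u <= 1")
    return min(1.0, max(0.0, (iou - d) / (u - d)))


def focaler_loss(base_loss: float, iou: float, d: float, u: float) -> float:
    """Assumed combination: base loss plus (IoU - remapped IoU).

    base_loss stands in for a precomputed IoU-style loss such as a
    Shape-IoU loss; computing that term is outside this sketch.
    """
    return base_loss + iou - focaler_iou(iou, d, u)


if __name__ == "__main__":
    # With the plain IoU loss (1 - IoU) as the base, the result
    # reduces to 1 - focaler_iou(iou, d, u).
    iou = 0.5
    print(focaler_loss(1.0 - iou, iou, d=0.2, u=0.8))
```

Passing the base loss in as a value keeps the sketch independent of any particular box-regression loss; in a real training loop these would be tensor operations rather than Python floats.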