Design of a small object detection network with multi-scale feature enhancement and boundary awareness
Affiliation:

1. School of Information Engineering, Zhengzhou Technology and Business University, Zhengzhou; 2. Naval Engineering University; 3. Taiyuan Engineering Group Co., Ltd., Zhengzhou

Fund Project:

2025 Philosophy and Social Sciences Planning Research Project of the Kaifeng Federation of Social Sciences

    Abstract:

    To address the challenges of small object detection in unmanned aerial vehicle (UAV) aerial images, namely large scale variation, weak feature representation, complex background interference, and insufficient localization accuracy, a multi-scale feature enhancement and boundary-aware network is proposed. First, a Multi-Scale Spatial Attention Augmentation Module (MSAAM) is designed to guide the network to focus adaptively on small object regions and improve feature discriminability. Second, a Multi-Scale Spatial Fusion Enhancement Module (MSFEM) is proposed, which combines parallel multi-scale branches with an efficient channel attention mechanism to enhance multi-scale feature representation. Finally, the Inner-IoU (Intersection over Union) loss function is introduced to strengthen perception of the inner region enclosed by object boundaries and improve localization accuracy. On the VisDrone2019 dataset, the proposed MSFE-ILNet (Multi-Scale Feature Enhancement and Inner-IoU Learning Network) achieves a precision, recall, and mAP@50 of 56.5%, 44.4%, and 47.3%, respectively, outperforming existing mainstream detection models. Further evaluation on the RSOD dataset indicates that the method generalizes well, providing an effective solution for small object detection in UAV aerial imagery.
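The Inner-IoU idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly described formulation in which both the predicted and the ground-truth box are rescaled about their own centers by a factor `ratio`, and the IoU is then computed on these auxiliary boxes. The function names, the `(x1, y1, x2, y2)` box layout, and the default `ratio=0.75` are illustrative assumptions; the abstract does not state the ratio used in MSFE-ILNet or how the term is combined with other regression losses.

```python
def box_iou(a, b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def scale_box(box, ratio):
    """Rescale a box about its own center; ratio < 1 shrinks, ratio > 1 enlarges."""
    cx = (box[0] + box[2]) / 2.0
    cy = (box[1] + box[3]) / 2.0
    half_w = (box[2] - box[0]) * ratio / 2.0
    half_h = (box[3] - box[1]) * ratio / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def inner_iou(pred, target, ratio=0.75):
    """IoU computed on the auxiliary (rescaled) boxes. ratio is an assumed value."""
    return box_iou(scale_box(pred, ratio), scale_box(target, ratio))


def inner_iou_loss(pred, target, ratio=0.75):
    """Loss in the basic form 1 - IoU_inner (assumed, not confirmed by the abstract)."""
    return 1.0 - inner_iou(pred, target, ratio)


if __name__ == "__main__":
    pred = (0.0, 0.0, 4.0, 4.0)
    target = (2.0, 0.0, 6.0, 4.0)
    for r in (0.5, 1.0, 1.5):
        print(f"ratio={r}: inner IoU={inner_iou(pred, target, r):.3f}")
```

With `ratio=1.0` the sketch reduces to ordinary IoU. A smaller ratio makes the measure stricter about center alignment, while a larger ratio still yields a non-zero overlap for loosely aligned boxes, which is one reason such auxiliary boxes are considered for small targets.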

History
  • Received: 2025-12-17
  • Last revised: 2026-02-02
  • Accepted: 2026-02-08