Abstract: To address detection failures caused by the weak semantic information of shallow features, an improved SSD (single shot multi-box detector) based on multi-layer feature fusion is proposed. First, depthwise separable convolution is added to the shallow network structure, and the shallow semantics are strengthened using depthwise (per-channel) and pointwise convolution. Next, the deep and shallow networks are refined by deconvolution and dilated (atrous) convolution. Finally, an attention mechanism is added to the deep network to enhance the ability of low-resolution features to detect small-scale objects. The proposed feature refinement mechanism combines the semantic information enhanced after downsampling and expands its resolution through deconvolution to improve the accuracy of small object detection. Experiments show that the method improves detection accuracy by 5.56% and recall by 9.48% on the VOC2007 and VOC2012 datasets.
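As a rough illustration of the building blocks named above, the following sketch uses the standard textbook formulas for a separable convolution (a per-channel convolution followed by a point convolution), a dilated convolution, and a deconvolution (transposed convolution). It is not the authors' implementation. The function names and the channel, kernel, and feature-map sizes are assumptions chosen for illustration and are not taken from the paper.

```python
# Illustrative arithmetic only; all sizes below are assumed, not from the paper.

def standard_conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def separable_conv_params(c_in, c_out, k):
    """Weight count of a per-channel k x k conv plus a 1 x 1 point conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def effective_kernel(k, dilation):
    """Spatial extent covered by a k-tap kernel at a given dilation rate."""
    return dilation * (k - 1) + 1


def deconv_output_size(size, k, stride, padding=0):
    """Output length of a transposed convolution without output padding."""
    return (size - 1) * stride - 2 * padding + k


# Hypothetical layer: 512 -> 512 channels with a 3 x 3 kernel.
print(standard_conv_params(512, 512, 3))
print(separable_conv_params(512, 512, 3))

# A 3 x 3 kernel with dilation 2 covers a wider window with the same weights.
print(effective_kernel(3, 2))

# Upsampling an assumed 19 x 19 map with a 2 x 2 kernel and stride 2.
print(deconv_output_size(19, 2, 2))
```

Under these assumed sizes, the separable form needs far fewer weights than the standard convolution, dilation widens the receptive field without adding weights, and a stride-2 deconvolution doubles the spatial resolution, which is the role the abstract assigns to each operation.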