Abstract: With the progress of modern science and technology, the development of image editing tools has greatly reduced the cost of tampering. Image tampering takes many forms, and existing forensic methods often generalize poorly across them; moreover, they typically focus only on locating the tampered region and ignore classifying the tampering operation. This paper proposes a two-stage network model based on an improved Mask R-CNN for image tampering forensics. In the feature extraction part, the input image is preprocessed with the spatial rich model (SRM) and constrained convolution, and then fed into the first four layers of ResNet101 to establish a unified feature representation that effectively reflects various tampering traces. The first-stage network detects the tampered area through an attention region proposal network (A-RPN), and a prediction module classifies the tampering operation and coarsely locates the tampered area. The location information produced by the first-stage network then guides the second-stage network to learn local features and localize the final tampered area. The proposed model can detect three types of image tampering operations: copy-paste, splicing, and removal. Experimental results show that the F1 scores of the proposed method on the NIST16, COVERAGE, Columbia, and CASIA datasets reach 0.924, 0.761, 0.791, and 0.473, respectively, outperforming traditional methods and several state-of-the-art deep learning methods.
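To make the described feature-extraction stage concrete, the following is a minimal PyTorch sketch of how an RGB stream, an SRM noise residual, and a constrained-convolution residual could be fused and passed through the early stages of ResNet101. The specific SRM kernel, the 9-channel fusion, the `TamperFeatureExtractor` name, and the cut-off after `layer3` are illustrative assumptions, not the authors' released implementation; the A-RPN and the two-stage prediction heads are omitted.

```python
# Minimal sketch of SRM + constrained-convolution preprocessing feeding ResNet101.
# Kernel choice, channel counts, and fusion strategy are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet101


class SRMFilter(nn.Module):
    """Fixed high-pass SRM-style filter that exposes local noise residuals."""
    def __init__(self):
        super().__init__()
        # One representative 5x5 SRM kernel (second-order residual), applied per RGB channel.
        kernel = torch.tensor([
            [-1.,  2.,  -2.,  2., -1.],
            [ 2., -6.,   8., -6.,  2.],
            [-2.,  8., -12.,  8., -2.],
            [ 2., -6.,   8., -6.,  2.],
            [-1.,  2.,  -2.,  2., -1.],
        ]) / 12.0
        self.register_buffer("weight", kernel.expand(3, 1, 5, 5).clone())

    def forward(self, x):
        # Depthwise convolution: one fixed filter per colour channel.
        return F.conv2d(x, self.weight, padding=2, groups=3)


class ConstrainedConv(nn.Module):
    """Bayar-style constrained convolution: centre weight -1, surrounding weights sum to 1."""
    def __init__(self, out_channels=3, kernel_size=5):
        super().__init__()
        self.conv = nn.Conv2d(3, out_channels, kernel_size,
                              padding=kernel_size // 2, bias=False)
        self.kernel_size = kernel_size

    def forward(self, x):
        c = self.kernel_size // 2
        w = self.conv.weight.clone()
        w[:, :, c, c] = 0.0
        w = w / (w.sum(dim=(2, 3), keepdim=True) + 1e-8)  # surround sums to 1
        w[:, :, c, c] = -1.0                               # centre fixed to -1
        return F.conv2d(x, w, padding=c)


class TamperFeatureExtractor(nn.Module):
    """Fuse RGB, SRM residual, and constrained-conv residual; run early ResNet101 stages."""
    def __init__(self):
        super().__init__()
        self.srm = SRMFilter()
        self.constrained = ConstrainedConv()
        backbone = resnet101(weights=None)
        # 9 input channels: 3 RGB + 3 SRM residual + 3 constrained-conv residual.
        backbone.conv1 = nn.Conv2d(9, 64, 7, stride=2, padding=3, bias=False)
        # Keep only the first four stages (conv1 .. layer3) as the shared feature map.
        self.stem = nn.Sequential(backbone.conv1, backbone.bn1, backbone.relu,
                                  backbone.maxpool, backbone.layer1,
                                  backbone.layer2, backbone.layer3)

    def forward(self, x):
        feats = torch.cat([x, self.srm(x), self.constrained(x)], dim=1)
        return self.stem(feats)


if __name__ == "__main__":
    model = TamperFeatureExtractor()
    out = model(torch.randn(1, 3, 512, 512))
    print(out.shape)  # e.g. torch.Size([1, 1024, 32, 32])
```

The resulting feature map would serve as the shared representation that the first-stage region proposal and classification heads, and later the second-stage localization network, operate on.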