The invention relates to the field of artificial intelligence security, and in particular to a black-box attack defense system and method based on neural network interlayer regularization. The system comprises a first source model, a second source model and a third source model. The method comprises the following steps: S1, inputting a picture into the first source model for a white-box attack and outputting a first adversarial sample sequence; S2, inputting the first adversarial sample sequence into the second source model and outputting a second adversarial sample sequence; S3, inputting the second adversarial sample sequence into the third source model for a black-box attack and outputting a third identification sample sequence; and S4, inputting the third identification sample sequence into the third source model for adversarial training and updating the third source model. Adversarial samples generated by this algorithm transfer well to a target model, and adversarial training on them effectively defends the target model against attack.
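
The data flow of steps S1 to S4 can be illustrated with a minimal, standard-library Python sketch. It is not the claimed implementation: the three "source models" are assumed here to be toy logistic classifiers, the white-box attack in S1 is assumed to be a single sign-gradient step, and the same kind of step stands in for the interlayer regularization of S2, which the abstract does not detail. All names (`attack_step`, `train_step`, `run_pipeline`, `EPS`) and the synthetic data are hypothetical.

```python
import math
import random

EPS = 0.1  # assumed per-step L-infinity perturbation budget


def sign(v):
    return (v > 0) - (v < 0)


def predict(w, x):
    """Probability of class 1 under a toy logistic model with weights w."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))


def attack_step(w, x, y, eps=EPS):
    """Sign-gradient step that raises the loss of model w on (x, y).

    For the logistic loss, the gradient with respect to x is (p - y) * w.
    """
    p = predict(w, x)
    return [xi + eps * sign((p - y) * wi) for xi, wi in zip(x, w)]


def mean_loss(w, samples):
    """Average logistic loss of model w over (x, y) pairs."""
    total = 0.0
    for x, y in samples:
        p = predict(w, x)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(samples)


def train_step(w, samples, lr=0.1):
    """One full-batch gradient-descent update of w on the given samples."""
    grad = [0.0] * len(w)
    for x, y in samples:
        p = predict(w, x)
        for i, xi in enumerate(x):
            grad[i] += (p - y) * xi / len(samples)
    return [wi - lr * gi for wi, gi in zip(w, grad)]


def run_pipeline(seed=0, n=32, dim=4):
    rng = random.Random(seed)
    # Synthetic stand-ins for input pictures and their labels (assumption).
    clean = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n)]
    labels = [1 if sum(x) > 0 else 0 for x in clean]
    w1, w2, w3 = ([rng.uniform(-1, 1) for _ in range(dim)] for _ in range(3))

    # S1: white-box attack on the first source model.
    adv1 = [attack_step(w1, x, y) for x, y in zip(clean, labels)]
    # S2: pass the first sequence through the second source model.
    adv2 = [attack_step(w2, x, y) for x, y in zip(adv1, labels)]
    # S3: black-box attack on the third model (forward queries only).
    outputs = [predict(w3, x) for x in adv2]
    third_seq = list(zip(adv2, labels))
    # S4: adversarial training of the third model on that sequence.
    loss_before = mean_loss(w3, third_seq)
    w3_new = train_step(w3, third_seq)
    loss_after = mean_loss(w3_new, third_seq)
    return {"clean": clean, "adv2": adv2, "outputs": outputs,
            "loss_before": loss_before, "loss_after": loss_after}
```

Calling `run_pipeline()` returns the clean inputs, the second adversarial sequence, the third model's responses, and its loss on the adversarial samples before and after one training update. In this sketch the third model's weights are never used to craft the perturbations, which is what makes S3 a black-box step.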