
Object detection method and system based on multi-scale feature map reconstruction and knowledge distillation

A multi-scale feature and target detection technology, applied in the field of target detection in computer vision, addresses problems such as the lack of optimization and the high time complexity of YOLOv3, and achieves higher detection accuracy and faster model inference

Active Publication Date: 2022-07-26
NANJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

Low-resolution, semantically strong features are upsampled and combined with high-resolution, semantically weak features to build a feature pyramid that shares rich semantics at all levels, but this approach still leaves considerable room for improvement. For example, [Shu Liu, Lu Qi, Haifang Qin, Jianping Shi, and Jiaya Jia. Path aggregation network for instance segmentation. In CVPR, 2018] manually designed the fusion structure and strengthened feature fusion, which improved detection accuracy considerably. However, these algorithms have not been optimized for YOLOv3 or for real-world scenes, so feature map reconstruction can still be improved substantially.
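The upsample-and-combine step described above can be sketched in a few lines. This is only an illustration under stated assumptions: nearest-neighbour 2x upsampling and channel-wise concatenation (the "tensor splicing" mentioned in the abstract) are assumed, and the function names `upsample2x` and `fuse` are hypothetical, not taken from the patent.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map (assumed mode)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fuse(deep, shallow):
    """Upsample the deep map and concatenate it with the shallow map along channels."""
    up = upsample2x(deep)
    if up.shape[1:] != shallow.shape[1:]:
        raise ValueError("spatial sizes must match after upsampling")
    return np.concatenate([up, shallow], axis=0)

# Toy example: one deep 2x2 channel fused with two shallow 4x4 channels.
deep = np.arange(4, dtype=float).reshape(1, 2, 2)
shallow = np.zeros((2, 4, 4))
fused = fuse(deep, shallow)
print(fused.shape)
```

The fused map keeps the shallow map's resolution while carrying the deep map's channels, which is the property a feature pyramid relies on.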
[0005] For model compression in target detection, much previous work has been proposed to compress large CNNs or to directly learn more efficient CNN models for fast inference. For example, [E. L. Denton, W. Zaremba, J. Bruna, Y. LeCun, and R. Fergus. Exploiting linear structure within convolutional networks for efficient evaluation. In NIPS, 2014] applied low-rank approximation, and [S. Han, J. Pool, J. Tran, and W. Dally. Learning both weights and connections for efficient neural network. In NIPS, pages 1135–1143, 2015] used weight pruning. Most of these techniques, however, require specially designed software/hardware accelerators to speed up execution, and relatively few compression methods target detection models on embedded devices. Existing compression algorithms applied to YOLOv3 have high time complexity and cannot adequately handle target detection tasks in embedded application scenarios (such as pedestrian and vehicle detection in intelligent transportation).



Examples


Embodiment Construction

[0067] The technical solution of the present invention is described in detail below with reference to the accompanying drawings:

[0068] As shown in Figure 1, an embodiment of the present invention discloses a target detection method based on multi-scale feature map reconstruction and knowledge distillation. Taking pedestrian and vehicle detection as an example, the target detection algorithm YOLOv3 [Redmon J, Farhadi A. YOLOv3: An incremental improvement[J]. arXiv preprint arXiv:1804.02767, 2018] extracts features from the CityStreet urban street view dataset provided by City University of Hong Kong and generates multi-scale feature maps. Each feature map is then compressed along the spatial dimension: every two-dimensional feature channel is squeezed into a single real number with a global receptive field, so the output dimension matches the number of input feature channels. By modeling, a weight is ...
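A minimal sketch of the channel squeeze and re-weighting may help here. The squeeze (global average pooling to one real number per channel) follows the paragraph above; the weight-generation step is truncated in this listing, so the gating used below (two small fully connected layers with ReLU and sigmoid, in the style of squeeze-and-excitation networks) is an assumption, as are the names `squeeze`, `excite`, `recalibrate`, and the reduction ratio `r`.

```python
import numpy as np

def squeeze(feature_map):
    """Global average pooling: (C, H, W) -> (C,), one real number per channel."""
    return feature_map.mean(axis=(1, 2))

def excite(z, w1, w2):
    """Assumed gating: FC -> ReLU -> FC -> sigmoid, giving one weight per channel."""
    hidden = np.maximum(w1 @ z, 0.0)
    return 1.0 / (1.0 + np.exp(-(w2 @ hidden)))

def recalibrate(feature_map, w1, w2):
    """Scale each channel by its learned weight (promote or suppress it)."""
    weights = excite(squeeze(feature_map), w1, w2)
    return feature_map * weights[:, None, None], weights

# Toy example with random (untrained) parameters; r is a hypothetical reduction ratio.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
y, weights = recalibrate(x, w1, w2)
print(y.shape, weights.shape)
```

The squeezed vector has exactly as many entries as the input has channels, which matches the statement that the output dimension equals the number of input feature channels.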


Abstract

The invention discloses a target detection method and system based on multi-scale feature map reconstruction and knowledge distillation. The method first uses the backbone network Darknet-53 to extract features; the deep features are upsampled and spliced with shallow feature tensors to generate multi-scale feature maps. A feature re-calibration strategy then automatically obtains the weight of each channel in the feature map, promoting useful features and suppressing useless ones according to those weights, and a residual module fuses the semantic information of the top-level features with the details of the low-level features. Next, the γ coefficients of the batch normalization layers in the backbone network are introduced into the pruning objective function for training, and channels whose γ coefficient falls below the pruning threshold are removed from the model. Finally, the trained YOLOv3 benchmark model serves as the teacher network and the pruned model serves as the student network for knowledge distillation. The invention improves the accuracy of detecting objects across a wide range of sizes, reduces the computation required by the model, and improves the model's detection speed.
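The pruning and distillation steps in the abstract can be illustrated with a short sketch. Several details are assumptions rather than the patent's stated method: the L1 form of the γ penalty, the choice of threshold as a global percentile of |γ|, and the temperature-softened KL divergence used as the distillation loss. All function names and the parameters `lam`, `prune_ratio`, and `temperature` are hypothetical.

```python
import numpy as np

def sparsity_penalty(gammas, lam):
    """Assumed L1 term added to the training objective: lam * sum of |gamma|."""
    return lam * sum(np.abs(g).sum() for g in gammas)

def pruning_threshold(gammas, prune_ratio):
    """Assumed rule: global percentile of |gamma| over all batch-norm layers."""
    all_g = np.sort(np.abs(np.concatenate(gammas)))
    idx = min(int(len(all_g) * prune_ratio), len(all_g) - 1)
    return all_g[idx]

def channel_masks(gammas, threshold):
    """Keep channels whose |gamma| is at or above the threshold."""
    return [np.abs(g) >= threshold for g in gammas]

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Assumed soft-target loss: KL(teacher || student) at a given temperature."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = np.sum(p_t * (np.log(p_t) - np.log(p_s)), axis=-1)
    return float(kl.mean() * temperature ** 2)

# Toy gamma values for two batch-norm layers.
gammas = [np.array([0.9, 0.01, 0.5, 0.02]), np.array([0.03, 0.7, 0.8, 0.04])]
threshold = pruning_threshold(gammas, prune_ratio=0.5)
masks = channel_masks(gammas, threshold)
kept = [int(m.sum()) for m in masks]

# Toy class logits for teacher and student on two samples.
teacher = np.array([[2.0, 0.5, -1.0], [0.1, 1.5, 0.3]])
student = np.array([[1.0, 0.2, -0.5], [0.0, 1.0, 0.5]])
loss = distillation_loss(student, teacher)
print(threshold, kept, loss)
```

In a real detector the distillation target would also have to cover box regression and objectness outputs; the class-logit loss here only shows the teacher-student relationship in its simplest form.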

Description

Technical field

[0001] The invention provides a target detection method and system based on multi-scale feature map reconstruction and knowledge distillation, belonging to the technical field of target detection in computer vision.

Background technique

[0002] Image target recognition is a research topic involving computer vision, pattern recognition, and artificial intelligence. With the rapid development of hardware technology, embedded intelligent devices based on deep learning platforms are becoming increasingly mature, and more and more detection algorithms are being embedded in such devices. However, traditional detection methods show large differences in detection accuracy for targets of different sizes within a given range, cannot identify targets accurately, and cannot meet daily needs. Traditional detection models also have too many parameters and therefore require more computing power. Therefore, it is necessary to propose a technology that can not only make the detection ...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06V10/75, G06V10/774, G06V10/764, G06V10/80, G06V10/82, G06K9/62, G06N3/04, G06N3/08
CPC: G06N3/084, G06V10/751, G06V2201/07, G06N3/044, G06N3/045, G06F18/241, G06F18/253, G06F18/214
Inventor: 刘天亮, 平安, 戴修斌, 邹玉龙
Owner: NANJING UNIV OF POSTS & TELECOMM