Vectorization implementation method of valid convolution of convolutional neural network

A vectorized implementation method for Valid convolution in convolutional neural networks. It addresses problems such as reduced data-loading efficiency, wasted storage bandwidth, and a mismatch between the data dimensions and the number of processing units. As a result, it improves overall computing efficiency, balances efficiency and accuracy, and avoids reduction summation.

Active Publication Date: 2022-03-18
NAT UNIV OF DEFENSE TECH

AI Technical Summary

Problems solved by technology

[0005] Several vectorized implementation methods for convolution calculation have been proposed that exploit the architectural characteristics of vector processors. For example, Chinese patent application 201810687639.X discloses a vectorization method for convolutional neural network operations on vector processors, patent application 201810689646.3 discloses a GPDSP-oriented convolutional neural network multi-core parallel computing method, and patent application 201710201589.5 discloses a vectorized implementation method of two-dimensional matrix convolution for vector processors. However, all of these schemes load the weight data into the vector array memory (AM) and the input image feature data into the scalar memory (SM) to complete the convolution calculation, and most of them reorder the data along the third dimension. Valid convolution is computationally intensive, yet no vectorized implementation method exists specifically for Valid convolution in convolutional neural networks. When the traditional schemes above are applied to a vectorized implementation of Valid convolution, the following problems arise:
[0006] 1. The weight data cannot be shared effectively, which wastes storage bandwidth and prevents the vector processor from reaching its full computing efficiency.
[0007] 2. The size of the third dimension is not fixed and does not match the number of processing units of the vector processor. It also differs between convolutional neural network models and between convolutional layers, so the data-loading efficiency of the above schemes is strongly affected and the schemes are not general.



Examples

Embodiment Construction

[0048] The present invention will be further described below in conjunction with the accompanying drawings and specific preferred embodiments, but the protection scope of the present invention is not limited thereby.

[0049] As shown in Figure 2, the vectorized implementation method for Valid convolution of a convolutional neural network in this embodiment includes the following steps:

[0050] Step 1: Store the input feature data set used for convolutional neural network calculation in a sample-dimension-first manner. That is, the input feature data set is stored contiguously in the off-chip memory of the vector processor as an N*M matrix, where M is the total number of samples in the data set and N=preH*preW*preC is the number of input features of a single sample. The convolution kernel data is stored in a kernel-count-dimension-first manner;

[0051] Step 2: the vector processor divides the input feature data set da...
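The sample-dimension-first layout of Step 1 can be illustrated with a minimal Python sketch. The function names and the ordering of the height, width, and channel indices within a row index are assumptions made for illustration; the text only states that N=preH*preW*preC and that the N*M matrix is stored contiguously.

```python
def feature_row(h, w, c, pre_w, pre_c):
    """Row of feature (h, w, c) in the N x M matrix.

    Assumption: features are ordered by height, then width, then channel.
    """
    return (h * pre_w + w) * pre_c + c


def to_sample_first(samples):
    """Flatten samples[m][h][w][c] into a row-major N x M buffer.

    Element (row, m) lives at offset row * M + m, so the M values of one
    feature across all samples are contiguous in memory.
    """
    m_total = len(samples)
    pre_h = len(samples[0])
    pre_w = len(samples[0][0])
    pre_c = len(samples[0][0][0])
    n = pre_h * pre_w * pre_c
    buf = [0.0] * (n * m_total)
    for m, sample in enumerate(samples):
        for h in range(pre_h):
            for w in range(pre_w):
                for c in range(pre_c):
                    row = feature_row(h, w, c, pre_w, pre_c)
                    buf[row * m_total + m] = sample[h][w][c]
    return buf, n, m_total
```

With this layout, one contiguous load fetches the same feature for many samples, so the vector width is filled along the sample dimension rather than along the channel dimension, whose size varies from layer to layer.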


Abstract

The invention discloses a vectorized implementation method for Valid convolution of a convolutional neural network. The steps include: Step 1: store the input feature data set in a sample-dimension-first manner, and store the convolution kernel data in a kernel-count-dimension-first manner; Step 2: divide the input feature data matrix into multiple matrix blocks by column; Step 3: each time, transfer the convolution kernel data matrix to the SM of each core, and transfer a sub-matrix composed of K rows extracted by row from the input feature data matrix to the AM of each core; Step 4: perform vectorized and parallelized matrix multiplication; Step 5: store the resulting output feature matrix in the off-chip memory of the vector processor; Step 6: repeat Steps 4 and 5 until all input feature data matrices have been processed. The invention has the advantages of a simple implementation, high execution efficiency and precision, and a small bandwidth requirement.
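The flow of Steps 2 to 6 can be sketched in plain Python as follows. This is an illustrative reading of the abstract, not the patented implementation: it assumes that K equals kH*kW*preC (the receptive field of one output position), that the kernel matrix has one row of K weights per kernel, that features are ordered by height, width, then channel, and that a stride parameter exists. The function name and the `block_cols` parameter are hypothetical, and the per-core SM/AM transfers are modeled only as list slices.

```python
def valid_conv_sample_first(data, pre_h, pre_w, pre_c, kernels, k_h, k_w,
                            stride=1, block_cols=2):
    """Valid convolution on an N x M sample-first matrix.

    data:    N rows of M samples each, N = pre_h * pre_w * pre_c.
    kernels: next_c rows of K weights each, K = k_h * k_w * pre_c (assumed).
    Returns (next_h * next_w * next_c) rows of M samples each.
    """
    m_total = len(data[0])
    next_c = len(kernels)
    next_h = (pre_h - k_h) // stride + 1  # Valid convolution: no padding
    next_w = (pre_w - k_w) // stride + 1
    out = [[0.0] * m_total for _ in range(next_h * next_w * next_c)]

    # Step 2: split the input matrix into column blocks (one per core).
    for start in range(0, m_total, block_cols):
        stop = min(start + block_cols, m_total)
        for oh in range(next_h):
            for ow in range(next_w):
                # Step 3: gather the K rows covered by this kernel window.
                rows = [
                    data[((oh * stride + i) * pre_w + (ow * stride + j))
                         * pre_c + c][start:stop]
                    for i in range(k_h)
                    for j in range(k_w)
                    for c in range(pre_c)
                ]
                base = (oh * next_w + ow) * next_c
                # Step 4: (next_c x K) times (K x block) matrix product.
                for n_idx, kernel in enumerate(kernels):
                    for col in range(stop - start):
                        acc = 0
                        for k, weight in enumerate(kernel):
                            acc += weight * rows[k][col]
                        # Step 5: write the result to the output matrix.
                        out[base + n_idx][start + col] = acc
    return out
```

In this sketch each output element accumulates over the K rows for a single sample column, so no summation across vector lanes is needed; the inner loop over `col` is the part a vector processor would execute as one vector operation.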

Description

Technical field

[0001] The invention relates to vector processors, and in particular to a vectorized implementation method for Valid convolution of a convolutional neural network.

Background

[0002] In recent years, deep learning models based on deep convolutional neural networks have made remarkable achievements in image recognition and classification, target detection, video analysis, and other areas, driven by the rapid development of related technologies such as data processing and processors. A Convolutional Neural Network (CNN) is a type of Feedforward Neural Network that includes convolution calculations and has a deep structure. It is one of the representative algorithms of deep learning. The input layer of a convolutional neural network can process multi-dimensional data. Since convolutional neural networks are most widely used in the field of computer vision, when designing the convolutional neural network structure, three-dimensional inp...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06N3/04, G06N3/063, G06N3/08, G06F15/80
CPC: G06N3/063, G06N3/08, G06F15/8007, G06N3/045
Inventor: 刘仲, 郭阳, 邓林, 田希, 扈啸, 陈海燕, 孙书为, 马媛, 曹坤, 吴立
Owner: NAT UNIV OF DEFENSE TECH