Streamlined convolution computing architecture design method and residual network acceleration system

A computing architecture and design method, applied in the fields of neural architectures, neural learning methods, biological neural network models, etc. It addresses the inability to achieve high hardware resource utilization and to fully exploit parallelism, with the effects of reducing memory access delay, enhancing parallelism, and avoiding repeated accesses to external memory.

Active Publication Date: 2021-05-28
SUN YAT SEN UNIV

AI Technical Summary

Problems solved by technology

However, because external memory must be accessed before and after the computation of each convolutional layer, and residual networks are usually deep, this incurs substantial energy consumption and memory access delay. Because of the particular structure of the residual network, a single convolution processing array can only execute serially the convolutional layers on the main path of the residual building block and on the shortcut branch, and then perform the point-by-point addition, so it cannot fully exploit the parallelism of the structure. In addition, the convolutional layers of a residual network come in various sizes, and a single convolution processing array handling convolutions of different sizes usually cannot achieve high hardware resource utilization.
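The memory access cost described above can be illustrated with a rough count of off-chip feature-map transfers for one residual building block. This is a minimal sketch under stated assumptions, not the patent's own analysis: the function name `offchip_feature_transfers` is hypothetical, each layer is assumed to read one input feature map and write one output feature map, and weight traffic and the shortcut branch are ignored.

```python
def offchip_feature_transfers(num_layers: int, keep_on_chip: bool) -> int:
    """Count off-chip feature-map transfers (reads + writes) for one block.

    Assumption (not from the source): every layer reads exactly one input
    feature map and writes exactly one output feature map; weights and the
    shortcut branch are not counted.
    """
    if keep_on_chip:
        # Only the block input is read and the block output is written;
        # intermediate feature maps stay in on-chip buffers.
        return 2
    # Layer-by-layer execution: every layer reads its input from, and
    # writes its output to, external memory.
    return 2 * num_layers


layer_by_layer = offchip_feature_transfers(3, keep_on_chip=False)
on_chip = offchip_feature_transfers(3, keep_on_chip=True)
print(layer_by_layer, on_chip)
```

Under these assumptions, a three-layer main path needs six feature-map transfers when each layer goes through external memory and two when intermediate results stay on chip.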

Examples

Embodiment 1

[0036] As shown in Figure 3, a design method for a pipelined convolution computing architecture includes the following steps:

[0037] S1: The hardware acceleration architecture is divided into an on-chip buffer, a convolution processing array, and a point-by-point addition module;

[0038] S2: The main path of the hardware acceleration architecture consists of three convolution processing arrays arranged in series, with two pipeline buffers inserted between them to realize an inter-layer pipeline across the three convolutional layers of the main path. The pipeline buffers are located in the on-chip buffer;

[0039] S3: A fourth convolution processing array is provided to process, in parallel, the convolutional layer with a 1×1 kernel on the branch of the residual building block. Its working mode is changed by configuring the registers in the fourth convolution processing array, so that it can also be used to calculate the convolutional layer or fully connected layer at the he...
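The dataflow of steps S1 to S3 can be sketched functionally as follows. This is an illustrative software model only, not the hardware design: the names `conv1x1`, `pointwise_add`, and `residual_block` are hypothetical, every array is modeled as a 1×1 convolution for brevity (the source only specifies 1×1 for the branch), activations and buffering are omitted, and the main path and branch run one after the other here rather than in parallel.

```python
from typing import Callable, List, Optional, Sequence

Feature = List[List[float]]  # pixels x channels (hypothetical flat layout)
ConvArray = Callable[[Feature], Feature]


def conv1x1(weights: List[List[float]]) -> ConvArray:
    """Build a 1x1 convolution: a per-pixel matrix-vector product."""
    def apply(x: Feature) -> Feature:
        return [[sum(w * v for w, v in zip(row, pixel)) for row in weights]
                for pixel in x]
    return apply


def pointwise_add(a: Feature, b: Feature) -> Feature:
    """Add two feature maps element by element."""
    return [[p + q for p, q in zip(pa, pb)] for pa, pb in zip(a, b)]


def residual_block(x: Feature,
                   main_arrays: Sequence[ConvArray],
                   branch_array: Optional[ConvArray] = None) -> Feature:
    """Main path: arrays in series. Branch: optional 1x1 conv, else identity."""
    main = x
    for array in main_arrays:
        main = array(main)
    # When the branch has no convolution, the fourth array is bypassed.
    shortcut = branch_array(x) if branch_array is not None else x
    return pointwise_add(main, shortcut)


double = conv1x1([[2.0, 0.0], [0.0, 2.0]])
x = [[1.0, 2.0], [3.0, 4.0]]
identity_out = residual_block(x, [double, double, double])
conv_out = residual_block(x, [double, double, double], branch_array=double)
print(identity_out, conv_out)
```

In this toy case the main path scales the input by 8, so the identity shortcut yields 9 times the input and the convolved shortcut yields 10 times the input.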

Embodiment 2

[0045] As shown in Figure 5, a residual network accelerator system designed using the pipelined convolution computing architecture design method includes: a direct memory access module, a pipelined convolution computing architecture module, a pooling operation unit, and a global control logic unit;

[0046] The direct memory access module sends read commands to the off-chip memory to transfer data from the off-chip memory to the on-chip input buffer, and sends write commands to the off-chip memory to write the final output feature computed by the current residual building block from the output buffer back to external memory;

[0047] The pooling operation unit is used to perform the average pooling operation or the maximum pooling operation; when the pooling operation needs to be performed, the pooling operation unit will read the feature data from the output buffer of the pipeline convolution computing architect...
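The two pooling modes named above can be sketched on a single-channel feature map as follows. This is a minimal software illustration, not the unit's hardware implementation: the function name `pool2d` and its `size`, `stride`, and `mode` parameters are assumptions, and no padding is applied.

```python
from typing import List


def pool2d(feature: List[List[float]], size: int, stride: int,
           mode: str = "max") -> List[List[float]]:
    """Apply max or average pooling over size x size windows (no padding)."""
    rows, cols = len(feature), len(feature[0])
    out = []
    for r in range(0, rows - size + 1, stride):
        out_row = []
        for c in range(0, cols - size + 1, stride):
            window = [feature[r + i][c + j]
                      for i in range(size) for j in range(size)]
            if mode == "max":
                out_row.append(max(window))
            elif mode == "avg":
                out_row.append(sum(window) / len(window))
            else:
                raise ValueError(f"unknown pooling mode: {mode}")
        out.append(out_row)
    return out


fmap = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
max_out = pool2d(fmap, size=2, stride=2, mode="max")
avg_out = pool2d(fmap, size=2, stride=2, mode="avg")
print(max_out, avg_out)
```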

Abstract

The invention provides a pipelined convolution computing architecture design method and a residual network acceleration system. According to the method, a hardware acceleration architecture is divided into an on-chip buffer, a convolution processing array, and a point-by-point addition module. The main path of the hardware acceleration architecture consists of three convolution processing arrays arranged in series, with two pipeline buffers inserted between them to realize an inter-layer pipeline across the three convolutional layers of the main path. A fourth convolution processing array is provided to process, in parallel, the convolutional layers with a 1×1 kernel on the branches of the residual building blocks. By configuring a register in the fourth convolution processing array, its working mode can be changed so that it can also be used to calculate the head convolutional layer or the fully connected layer of the residual network; when the branch of a residual building block has no convolution, the fourth convolution processing array is bypassed and no convolution is executed. A point-by-point addition module adds, element by element, the corresponding output feature pixels of the main path of the residual building block and of the shortcut branch.
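The benefit of the inter-layer pipeline can be illustrated with a simple cycle-count model. This is a sketch under stated assumptions rather than a result from the patent: the feature map is assumed to be processed in equal tiles, the per-tile cycle counts in `stage_cycles` are invented for illustration, and the pipeline buffers are assumed never to stall a producing stage.

```python
from typing import Sequence


def serial_cycles(stage_cycles: Sequence[int], num_tiles: int) -> int:
    """One array runs all three layers back to back for every tile."""
    return num_tiles * sum(stage_cycles)


def pipelined_cycles(stage_cycles: Sequence[int], num_tiles: int) -> int:
    """Three arrays in series; a stage starts a tile once the previous
    stage has produced it and the stage itself is free."""
    finish = [0] * len(stage_cycles)  # time each stage finished its last tile
    for _ in range(num_tiles):
        prev_stage_done = 0
        for s, cycles in enumerate(stage_cycles):
            start = max(prev_stage_done, finish[s])
            finish[s] = start + cycles
            prev_stage_done = finish[s]
    return finish[-1]


stage_cycles = [4, 6, 4]  # hypothetical cycles per tile for each layer
serial = serial_cycles(stage_cycles, num_tiles=10)
pipelined = pipelined_cycles(stage_cycles, num_tiles=10)
print(serial, pipelined)
```

In this model the pipelined total is the fill latency plus one slowest-stage interval per additional tile, so balancing the three stages matters more than shortening any single fast stage.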

Description

Technical field

[0001] The present invention relates to the field of computer vision scene processing methods, and more specifically, to a method for designing a pipeline convolution computing architecture and a residual network acceleration system.

Background

[0002] Convolutional neural networks (CNNs) are widely used in various computer vision scenarios and have shown superior performance. However, due to complex and intensive computing requirements and huge storage requirements, it is a challenge to deploy and accelerate convolutional neural networks on mobile devices and embedded platforms that are sensitive to power consumption and require high real-time performance.

[0003] In the convolutional neural network, the calculation time of the convolutional layer occupies more than 90% of the total calculation time of the network. Therefore, the acceleration of the convolutional layer operation is the most important part of the acceleration of the convolutional ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/04, G06N3/063, G06N3/08
CPC: G06N3/063, G06N3/08, G06N3/045, Y02D10/00
Inventor: 黄以华, 黄俊源, 陈志炜
Owner: SUN YAT SEN UNIV