A general-purpose convolutional neural network accelerator based on a one-dimensional systolic array

The technology combines convolutional neural networks with a systolic array and is applied in the fields of electronic information and deep learning. It addresses the large parameter counts and computation loads of such networks, improving operating and computing efficiency while reducing communication waiting time.

Active Publication Date: 2019-06-25
SOUTHEAST UNIV +1
5 Cites, 48 Cited by
AI Technical Summary

Problems solved by technology

[0003] However, the number of parameters and the amount of computation required by high-performance convolutional neural networks are also very large. For example, for tasks such as detection, recognition, and semantic segmentation of high-definition images, the weight data of the model alone can reach hundreds of megabytes, and even the inference process often requires tens to hundreds of billions of multiply-accumulate operations. The data access frequency, computation load, and storage requirements all put great pressure on the computing platform. A high-performance general-purpose convolutional neural network hardware accelerator is needed to solve these problems, and the present invention arises from this need.
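To make the scale of these figures concrete, the cost of a single convolution layer can be estimated from its shape. The sketch below is illustrative only: the function name `conv_layer_cost` and the example layer (3x3 kernels, 3 input channels, 64 output channels, a 224x224 output map) are assumptions and do not come from the patent, and bias terms are ignored.

```python
def conv_layer_cost(h_out, w_out, c_in, c_out, kh, kw):
    """Return (weight_count, mac_count) for one convolution layer.

    Every output pixel of every output channel needs c_in * kh * kw
    multiply-accumulate (MAC) operations. Biases are ignored.
    """
    weights = c_in * c_out * kh * kw
    macs = h_out * w_out * weights
    return weights, macs


# Hypothetical example layer, not taken from the patent.
weights, macs = conv_layer_cost(224, 224, 3, 64, 3, 3)
print(weights, macs)
```

Summing this count over all layers of a deep network operating on high-resolution inputs is what leads to totals in the billions of operations.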

Examples

Embodiment Construction

[0024] The technical solutions and beneficial effects of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0025] Figure 1 shows the structure of each module of the accelerator designed by the present invention. It works as follows:

[0026] The off-chip processor sends a mode configuration instruction to the accelerator in advance. After receiving the instruction, the mode configurator decodes it and accordingly sets the configuration ports of each functional module or assigns values to the configuration registers. The configurable part of the data scheduling module comprises the following configuration registers: feature map row length ML, number of convolution kernel rows KH, number of convolution kernel columns KL, convolution kernel stride S, number of convolution kernels KC, feature map padding number PAD, number of rows computed per channel LC, pooling type PT, and data update mode DR. When the...
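The configuration registers of the data scheduling module listed above can be pictured as a small record. The following is a minimal sketch, not the patent's register layout: the class name `SchedulerConfig`, the integer encodings of PT and DR, and the assumption that PAD is applied to both ends of a row are all hypothetical. The method uses the standard convolution output-size formula.

```python
from dataclasses import dataclass


@dataclass
class SchedulerConfig:
    ML: int   # feature map row length
    KH: int   # number of convolution kernel rows
    KL: int   # number of convolution kernel columns
    S: int    # convolution kernel stride
    KC: int   # number of convolution kernels
    PAD: int  # feature map padding number (assumed: applied on both sides)
    LC: int   # number of rows computed per channel
    PT: int   # pooling type (integer encoding is hypothetical)
    DR: int   # data update mode (integer encoding is hypothetical)

    def output_row_length(self) -> int:
        """Length of one output row under the standard convolution formula."""
        return (self.ML + 2 * self.PAD - self.KL) // self.S + 1


# Hypothetical values: a 3x3 kernel, stride 1, padding 1 on a 224-wide row.
cfg = SchedulerConfig(ML=224, KH=3, KL=3, S=1, KC=64, PAD=1, LC=224, PT=0, DR=0)
print(cfg.output_row_length())
```

Grouping the fields in one record mirrors the idea that a single decoded instruction sets all of them before computation starts.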

Abstract

The invention discloses a general-purpose convolutional neural network accelerator based on a one-dimensional systolic array. An AXI4 bus interface is used to load the mode configuration instruction, read the data to be calculated, and send result data in batches. The mode configurator configures each functional module to the corresponding working mode according to the mode configuration instruction. The data scheduling module can concurrently cache the data to be calculated, read calculation data, cache convolution results, process convolution results, and output convolution results. The convolution calculation module performs convolution using a one-dimensional systolic array. The to-be-calculated data cache, the convolution result cache, and the output result buffer FIFO cache the corresponding data. The result processing module carries out the result processing operations common in convolutional neural networks. The accelerator is compatible with the different calculation types in a convolutional neural network and accelerates them effectively through highly parallel computation, while requiring only a low off-chip memory access bandwidth and a small amount of on-chip memory resources.
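The idea of convolving on a one-dimensional chain of processing elements (PEs) can be sketched in software. The excerpt does not describe the accelerator's actual dataflow, so the simulation below is a generic textbook arrangement and only an assumption: each PE holds one fixed weight, the current input sample is broadcast to all PEs, and partial sums advance one PE per clock cycle. The function name `systolic_conv1d` is hypothetical.

```python
def systolic_conv1d(x, w):
    """Simulate a 1-D PE chain computing y[i] = sum_k w[k] * x[i + k].

    PE k holds weight w[k]. On cycle t the sample x[t] is broadcast to
    every PE, and each PE adds w[k] * x[t] to the partial sum handed
    over by its left neighbour on the previous cycle.
    """
    K, N = len(w), len(x)
    psum = [0] * K              # psum[k]: output register of PE k
    out = []
    for t in range(N):          # one iteration models one clock cycle
        nxt = [0] * K
        for k in range(K):
            incoming = psum[k - 1] if k > 0 else 0
            nxt[k] = incoming + w[k] * x[t]
        psum = nxt
        if t >= K - 1:          # the chain is full, so the last PE holds a finished sum
            out.append(psum[K - 1])
    return out


print(systolic_conv1d([1, 2, 3, 4, 5], [1, 0, -1]))
```

After an initial fill of K - 1 cycles, the chain emits one finished output per cycle while each input sample is fetched only once, which is the property that keeps memory bandwidth demands low in designs of this kind.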

Description

Technical field

[0001] The invention belongs to the technical field of electronic information and deep learning, and in particular relates to a general-purpose convolutional neural network hardware accelerator based on a one-dimensional systolic array (1-D Systolic Array).

Background technique

[0002] In recent years, deep convolutional neural networks have received extensive attention. From the Google Brain team's use of deep neural networks to "recognize cats" in 2012 to the dominance of the DeepMind team's AlphaGo / AlphaZero in Go in 2016 and 2017, "deep learning", represented by convolutional neural networks, has attracted not only public attention but also great interest from academia and industry. Through the efforts of researchers and engineers, convolutional neural networks have been widely applied in many areas, such as image recognition, object detection, and natural language processing.

[0003] However, high-performance convolutional neural networks ...

Application Information

IPC(8): G06N3/063, G06N3/04
CPC: Y02D10/00
Inventor: 陆生礼, 庞伟, 罗几何, 李宇峰
Owner: SOUTHEAST UNIV