CNN reasoning acceleration system, acceleration method and medium

An acceleration system based on multiply-accumulate technology, applied in the field of CNN inference acceleration. It addresses the problems that existing approaches lack flexibility and cannot use the existing software ecosystem, and that x86 and ARM cannot be customized or extended. It achieves efficient, convenient acceleration with strong flexibility.

Active Publication Date: 2021-04-16
SUZHOU LANGCHAO INTELLIGENT TECH CO LTD

AI Technical Summary

Problems solved by technology

[0005] The present invention mainly addresses two problems: implementing CNN acceleration with an existing ASIC leads to insufficient flexibility and cannot make use of the existing software ecosystem, and x86 and ARM cannot be customized or extended.

Examples

Embodiment 1

[0062] Referring to Figure 1, this embodiment provides a CNN inference acceleration system comprising: an instruction storage module, an instruction fetch module, a decoding module, an instruction dispatch module, a data storage module, an IMC instruction module, a vector instruction module, and a vector register module;

[0063] The instruction storage module, the instruction fetch module, the decoding module, the instruction dispatch module, the data storage module, the IMC instruction module and the vector instruction module are connected by the AXI bus and interact through the VALID/READY handshake mechanism;
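The VALID/READY handshake mentioned in [0063] can be illustrated with a small cycle-level model. This is only a sketch under stated assumptions, not the patent's implementation: the `Channel` class, the `simulate` function and the example values are hypothetical. The one rule it relies on is the standard one for this style of handshake: a data beat transfers only in a cycle where the sender's VALID and the receiver's READY are both high.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Channel:
    """One-directional channel with VALID/READY signals (hypothetical model)."""
    valid: bool = False
    ready: bool = False
    data: Optional[int] = None


def simulate(payloads, ready_pattern):
    """Simulate the handshake cycle by cycle.

    payloads: values the sender wants to transfer, in order.
    ready_pattern: the receiver's READY value for each cycle.
    Returns (received values, cycle indices at which a transfer happened).
    """
    ch = Channel()
    pending = list(payloads)
    received, cycles = [], []
    for cycle, ready in enumerate(ready_pattern):
        # The sender asserts VALID while it has data and holds the data
        # stable until the receiver accepts it.
        ch.valid = bool(pending)
        ch.data = pending[0] if pending else None
        ch.ready = ready
        # A transfer happens only when VALID and READY are both high.
        if ch.valid and ch.ready:
            received.append(ch.data)
            cycles.append(cycle)
            pending.pop(0)
    return received, cycles


received, cycles = simulate([10, 20, 30], [False, True, True, False, True])
print(received, cycles)  # [10, 20, 30] [1, 2, 4]
```

In the example the receiver is not ready in cycles 0 and 3, so the sender simply keeps its data on the channel and the three values transfer in cycles 1, 2 and 4.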

[0064] The instruction storage module stores instructions, and the data storage module stores all data generated while the system is running; both are implemented with on-chip SRAM or cache memory. The instruction storage module's interface is an AXI interface, through which it connects to the AXI bus;

[0065] When the scal...

Embodiment 2

[0126] Referring to Figure 3, based on the same inventive concept as the CNN inference acceleration system in the foregoing embodiment, this embodiment also provides an acceleration method for the CNN inference acceleration system, including:

[0127] S10: the instruction fetch module reads an instruction stored in the instruction storage module, using the access address generated by the address generation module within the instruction fetch module, and sends the instruction to the decoding module;

[0128] S11: after receiving the instruction, the decoding module parses it. The parsed information includes the instruction type, the instruction operands, and the information that controls instruction execution. The decoding module sends the parsed information to the instruction dispatch module;

[0129] S12. After receiving the parsed information, the instruction dispatch module reads the state in the vector i...
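The fetch, decode and dispatch flow of steps S10 and S11 can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the patented design: the text-based instruction format, the opcode names (`vmac`, `vadd`, `relu`, `pool`), the `OPCODE_TYPE` table and all function names are hypothetical. Routing activation and pooling to the IMC module follows the abstract's description; the state check that S12 performs before dispatch is truncated in the source and is not modeled here.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical opcode table: which execution module handles each opcode.
OPCODE_TYPE = {"vadd": "vector", "vmac": "vector", "relu": "imc", "pool": "imc"}


@dataclass
class Decoded:
    kind: str                  # instruction type: "vector" or "imc"
    opcode: str
    operands: Tuple[int, ...]


def fetch(imem: List[str], pc: int) -> Tuple[str, int]:
    """S10: read the instruction at address pc; the next address is pc + 1."""
    return imem[pc], pc + 1


def decode(raw: str) -> Decoded:
    """S11: parse a raw instruction such as 'vadd 1 2 3' into type and operands."""
    opcode, *args = raw.split()
    return Decoded(OPCODE_TYPE[opcode], opcode, tuple(int(a) for a in args))


def run(imem: List[str]) -> Dict[str, List[Decoded]]:
    """Fetch and decode every instruction, then route it to its module's queue."""
    queues: Dict[str, List[Decoded]] = {"vector": [], "imc": []}
    pc = 0
    while pc < len(imem):
        raw, pc = fetch(imem, pc)
        inst = decode(raw)
        queues[inst.kind].append(inst)
    return queues


program = ["vmac 0 1 2", "relu 2", "pool 2 3", "vadd 3 4 5"]
queues = run(program)
print([i.opcode for i in queues["vector"]])  # ['vmac', 'vadd']
print([i.opcode for i in queues["imc"]])     # ['relu', 'pool']
```

Keeping decode separate from dispatch mirrors the module split in the text: the decoder only produces the type, operands and control information, and the dispatcher alone decides which execution module receives the instruction.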

Embodiment 3

[0132] Based on the same inventive concept as the CNN inference acceleration system in the foregoing embodiments, this embodiment also provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the steps of the acceleration method of the CNN inference acceleration system described above.

[0133] The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.

Abstract

The invention discloses a CNN inference acceleration system comprising an instruction operation module, a data storage module, an IMC instruction module, a vector instruction module and a vector register module. The instruction operation module stores instructions, decodes and parses them, and sends the parsed instructions to the IMC instruction module and the vector instruction module. The data storage module stores system data. The IMC instruction module receives the parsed instructions sent by the instruction operation module and performs image preprocessing, activation processing and pooling processing. The vector instruction module executes vector instructions and writes their results to the vector register module, which stores them. The system meets CNN acceleration requirements and is open, modular and extensible. In addition, secondary development can be carried out on the software side to build a complete software tool chain, thereby meeting users' individual requirements.

Description

Technical field

[0001] The invention relates to the field of CNN inference acceleration, in particular to a CNN inference acceleration system, acceleration method and medium.

Background

[0002] The single-instruction, multiple-data structure of the GPU (Graphics Processing Unit) supports vector operations well and can be used to accelerate CNNs (Convolutional Neural Networks). However, the GPU is not specifically designed for CNN acceleration, and its energy efficiency when running CNN algorithms is low.

[0003] An ASIC (Application Specific Integrated Circuit) is a chip customized to meet specific requirements. This customization helps improve the performance-to-power ratio, so ASIC-based CNN acceleration has a clear energy-efficiency advantage over the GPU. However, if an instruction-free approach is used when designing an ASIC, it will result in insufficient flexibility and cannot take advantage of the existing software ecosys...

Application Information

IPC(8): G06F9/30, G06N3/063, G06N5/04
Inventor 杨继林
Owner SUZHOU LANGCHAO INTELLIGENT TECH CO LTD