CNN inference acceleration system, acceleration method, and medium
An acceleration system using multiply-accumulate technology, applied in the field of CNN inference acceleration. It addresses the inability to use an existing software ecosystem, the lack of flexibility, and the fact that x86 and ARM cannot be customized or extended, and it achieves efficient, convenient, and flexible acceleration.
Examples
Embodiment 1
[0062] Referring to Figure 1, this embodiment provides a CNN inference acceleration system comprising: an instruction storage module, an instruction fetch module, a decoding module, an instruction dispatch module, a data storage module, an IMC instruction module, a vector instruction module, and a vector register module;
[0063] The instruction storage module, the instruction fetch module, the decoding module, the instruction dispatch module, the data storage module, the IMC instruction module, and the vector instruction module are connected by the AXI bus and interact through the VALID/READY handshake mechanism;
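The VALID/READY handshake named above can be illustrated with a small cycle-level model. This is a minimal sketch, not the patented design: the `Channel` class, the `simulate` function, and the per-cycle `ready_pattern` argument are hypothetical names introduced only for illustration. It assumes the usual rule that a transfer happens only in a cycle where the sender asserts VALID and the receiver asserts READY, and that the sender holds its data stable until then.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Channel:
    """Hypothetical one-directional link between two modules."""
    valid: bool = False   # driven by the sender
    ready: bool = False   # driven by the receiver
    data: object = None

    def fires(self) -> bool:
        # A transfer occurs only when VALID and READY are both high.
        return self.valid and self.ready


def simulate(items, ready_pattern):
    """Send `items` over one channel.

    `ready_pattern[i]` is the receiver's READY value in cycle i
    (an assumption made for this sketch).
    """
    pending = deque(items)
    received = []
    ch = Channel()
    for ready in ready_pattern:
        # Sender: raise VALID and keep the data stable until it is accepted.
        if not ch.valid and pending:
            ch.valid, ch.data = True, pending[0]
        # Receiver: drive READY for this cycle.
        ch.ready = ready
        if ch.fires():
            received.append(ch.data)
            pending.popleft()
            ch.valid = False
    return received
```

For example, `simulate([1, 2, 3], [False, True, True, False, True])` stalls in the cycles where READY is low and still delivers every item in order.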
[0064] The instruction storage module stores instructions, and the data storage module stores all data generated while the system is running; both are implemented in on-chip SRAM or cache memory. The instruction storage module uses an AXI interface, through which it connects to the AXI bus;
[0065] When the scal...
Embodiment 2
[0126] Referring to Figure 3, based on the same inventive concept as the CNN inference acceleration system in the foregoing embodiment, this embodiment also provides an acceleration method for the CNN inference acceleration system, including:
[0127] S10. The instruction fetch module reads the instruction stored in the instruction storage module, using the access address generated by the address generation module within the instruction fetch module, and sends the instruction to the decoding module;
[0128] S11. After receiving the instruction, the decoding module parses it; the parsed information includes the instruction type, the instruction operands, and the information that controls instruction execution. The decoding module then sends the parsed information to the instruction dispatch module;
[0129] S12. After receiving the parsed information, the instruction dispatch module reads the state in the vector i...
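Steps S10 and S11 can be sketched in software as a fetch unit with an address generator feeding a decoder. This is an illustrative model only. The 16-bit encoding, the `TYPE_NAMES` table, and the `FetchUnit`, `Decoded`, `decode`, and `run` names are all hypothetical, because the source does not give the instruction format. The dispatch step (S12) is not modeled, since its description is elided in the source.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical toy encoding, NOT taken from the patent:
# bits [15:12] = instruction type, [11:8], [7:4], [3:0] = three operand fields.
TYPE_NAMES = {0x1: "vector", 0x2: "imc"}


@dataclass
class Decoded:
    kind: str                         # instruction type
    operands: Tuple[int, int, int]    # instruction operands
    control: Dict[str, bool]          # execution-control information


class FetchUnit:
    """S10: generate an access address and read the instruction word."""

    def __init__(self, imem: List[int]):
        self.imem = imem   # stands in for the instruction storage module
        self.pc = 0        # state of the address generator

    def fetch(self) -> Optional[int]:
        if self.pc >= len(self.imem):
            return None    # no more instructions
        word = self.imem[self.pc]
        self.pc += 1       # next sequential address
        return word


def decode(word: int) -> Decoded:
    """S11: parse type, operands, and control information from one word."""
    kind = TYPE_NAMES.get((word >> 12) & 0xF, "unknown")
    operands = ((word >> 8) & 0xF, (word >> 4) & 0xF, word & 0xF)
    control = {"recognized": kind != "unknown"}  # hypothetical control bit
    return Decoded(kind, operands, control)


def run(imem: List[int]) -> List[Decoded]:
    """Fetch every word in order and decode it."""
    fetch_unit = FetchUnit(imem)
    decoded = []
    while (word := fetch_unit.fetch()) is not None:
        decoded.append(decode(word))
    return decoded
```

Under this toy encoding, `run([0x1123, 0x2456])` yields a `vector` instruction with operands `(1, 2, 3)` followed by an `imc` instruction with operands `(4, 5, 6)`. In the described system these records would then be handed to the instruction dispatch module.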
Embodiment 3
[0132] Based on the same inventive concept as the CNN inference acceleration system in the foregoing embodiments, this embodiment also provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of the above-mentioned acceleration method for the CNN inference acceleration system.
[0133] The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.