Matrix multiplication acceleration method for general multi-core DSP

A matrix multiplication technology, applied in the field of matrix multiplication acceleration for general multi-core DSP structures, that addresses the difficulty of meeting the performance requirements of DSP processors.

Active Publication Date: 2017-03-15
NAT UNIV OF DEFENSE TECH

AI Technical Summary

Problems solved by technology

[0004] Because processors differ in design goals and instruction set architectures, traditional matrix multiplication implementations written for general-purpose processors struggle to meet the performance requirements of DSP (Digital Signal Processing) processors designed for specific applications. Efficient matrix multiplication must therefore be customized for the DSP architecture, to raise the speed of matrix multiplication and meet the processor's design goals as fully as possible.

Examples

Embodiment Construction

[0118] Figure 1 shows a general multi-core DSP architecture.

[0119] In Figure 1, each single-core DSP consists of an SPU and a VPU. The SPU consists of an L1I (Level 1 Instruction) cache, an L1D (Level 1 Data) cache, an SPE (Scalar Processing Element), scalar registers, and a flow controller. The L1I caches instructions; the L1D caches data; the SPE handles part of the instruction flow control, the configuration of the vector unit, and the main communication tasks. The VPU comprises the AM (Array Memory), vector registers, and multiple VPEs (Vector Processing Elements) that can execute concurrently. The AM is mainly used for array buffering, and the vector SIMD unit formed by the VPEs is mainly used to accelerate numerical operations.

[0120] Figure 2 is the overall flowchart of the matrix multiplication acceleration method for general multi-core DSP of the present invention.

[0121] The steps of the present invention are as follows:

[0122] The first step is DSP configuration and initialization...

Abstract

The invention discloses a matrix multiplication acceleration method for general multi-core DSPs (Digital Signal Processors). It aims to increase the speed of matrix multiplication and maximize the computing efficiency of a general multi-core DSP. The technical scheme is as follows: first, configure and initialize the DSP; then, according to the mg × ng topology of the VPUs (Vector Processing Units), partition matrix A and matrix B and convert the original matrix multiplication into block matrix multiplication; next, have every VPU perform its data migration operations synchronously and in parallel; finally, merge the results held in the AM (Array Memory) of each VPU, according to the data distribution principle, into the result matrix C = A × B. The method increases both the matrix multiplication speed on a general multi-core DSP structure and the utilization of the system's computing resources.
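The partition-and-merge idea in the abstract can be illustrated with a minimal sketch. This is not the patented procedure: the abstract does not say how A and B are divided, so the sketch assumes A is cut into mg row bands and B into ng column bands, with a hypothetical VPU at grid position (i, j) computing the tile formed by row band i and column band j. The function names (`split_bounds`, `block_product`, `blocked_matmul`) are invented for illustration, the tiles are computed sequentially rather than in parallel, and DSP configuration and data migration are not modeled.

```python
from typing import List

Matrix = List[List[float]]


def split_bounds(n: int, parts: int) -> List[range]:
    """Split range(n) into `parts` contiguous, near-equal index ranges."""
    base, extra = divmod(n, parts)
    bounds, start = [], 0
    for p in range(parts):
        size = base + (1 if p < extra else 0)
        bounds.append(range(start, start + size))
        start += size
    return bounds


def block_product(a: Matrix, b: Matrix, rows: range, cols: range) -> Matrix:
    """Work of one hypothetical VPU: the tile A[rows, :] x B[:, cols]."""
    inner = len(b)
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in cols]
        for i in rows
    ]


def blocked_matmul(a: Matrix, b: Matrix, mg: int, ng: int) -> Matrix:
    """Compute C = A x B tile by tile over an assumed mg x ng grid."""
    m, n = len(a), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for rows in split_bounds(m, mg):        # assumed: row bands of A
        for cols in split_bounds(n, ng):    # assumed: column bands of B
            tile = block_product(a, b, rows, cols)
            # Merge step: copy the tile into its place in C.
            for ti, i in enumerate(rows):
                for tj, j in enumerate(cols):
                    c[i][j] = tile[ti][tj]
    return c


if __name__ == "__main__":
    A = [[1, 2, 3], [4, 5, 6]]
    B = [[7, 8], [9, 10], [11, 12]]
    print(blocked_matmul(A, B, mg=2, ng=2))
```

Because every tile depends only on its own row band of A and column band of B, the tiles are independent of one another, which is what would allow the VPUs to work in parallel before the final merge.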

Description

Technical field

[0001] The invention relates to a matrix multiplication acceleration method, in particular to one oriented to a general multi-core DSP structure.

Background art

[0002] As the computing performance of general-purpose DSPs keeps rising and they are applied ever more widely, general-purpose multi-core DSPs are set to become an important direction in high-performance computing. Matrix multiplication is among the most commonly used operations in numerical computation, and many applications include it. Designing efficient matrix multiplication methods for general-purpose multi-core DSPs can therefore speed up applications and improve the computing efficiency of general-purpose multi-core DSPs, thereby achieving their design goals.

[0003] Matrix multiplication is to multiply one row of the m...
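For reference, the standard textbook definition of the matrix product is given here. It is not recovered from the truncated paragraph, and the symbols m, k, n and the index p are chosen for illustration. For A of size m × k and B of size k × n, each entry of C = A × B is the dot product of a row of A with a column of B:

```latex
C_{ij} = \sum_{p=1}^{k} A_{ip}\, B_{pj}, \qquad 1 \le i \le m,\quad 1 \le j \le n
```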

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/16
Inventor: 迟利华, 刘杰, 甘新标, 晏益慧, 徐涵, 胡庆丰, 蒋杰, 李胜国, 王庆林, 皇甫永硕, 崔显涛, 周陈
Owner: NAT UNIV OF DEFENSE TECH