Implementation method of vector aggregation loading instruction

An implementation method for vector aggregation (gather) load instructions, applied in the field of microprocessor design. It addresses three problems of the prior approach: the micro-operation management function of the vector buffer component duplicates the function of the launch (issue) queue, multiple vector aggregation load instructions cannot be supported at the same time, and they therefore cannot execute in parallel. The method reduces the number of micro-operations, reduces Perm operations, and improves program performance.

Active Publication Date: 2020-03-24
NAT UNIV OF DEFENSE TECH

AI Technical Summary

Problems solved by technology

After splitting in this way, three Perm operations for data conversion are added on top of the Load operations that fetch the data. This increases both the power consumption and the latency of instruction execution, and so degrades performance.
In addition, the hardware design must add support for the Perm data-conversion operation.
[0005] Chinese patent application No. 201810668398.4 discloses a launch (issue) method and device for mixed execution of scalar and vector instructions. The method can be used to issue and manage vector aggregate load instructions, but it has three disadvantages:
1) The micro-operation management function of the vector buffer component duplicates the function of the launch queue, which wastes resources.
2) Because the vector buffer component contains micro-operation management logic, it is relatively large. Given the hardware budget of the whole chip, the vector buffer component cannot hold multiple vector aggregate load instructions at the same time, so those instructions cannot execute in parallel and program performance is limited.
3) Because both the launch queue and the vector buffer component hold instructions waiting to be sent to the pipeline, an additional arbitration unit is required, and managing the order between instructions becomes complicated.


Examples

Embodiment Construction

[0034] As shown in Figure 3, the method for implementing the vector aggregation load instruction in this embodiment comprises the following steps:

[0035] 1) Split the vector aggregate load instruction into multiple ordinary load micro-operations according to the size of the vector elements;

[0036] 2) Send the split ordinary load micro-operations to the instruction queue;

[0037] 3) Wait in the instruction queue until the source operand of an ordinary load micro-operation is ready, then issue that micro-operation, carrying its vector element number, to the storage pipeline;

[0038] 4) Execute the ordinary load micro-operation for the single element;

[0039] 5) Determine whether the execution succeeded. If it did, proceed to the next step; otherwise, the launch queue must choose an opportunity to re-issue the memory access operation for that element to the pipeline to ob...
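Steps 1) to 5) can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the patented hardware: the 256-bit vector width, the names `LoadUop`, `split_gather`, and `run_gather`, the one-issue-per-cycle policy, and the way a failed access is simulated (a set of element numbers that fail once) are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass

VECTOR_BITS = 256  # assumed vector register width (not stated in the source)


@dataclass
class LoadUop:
    element: int          # vector element number carried with the micro-op
    address: int          # per-element address, i.e. the source operand
    ready_cycle: int = 0  # cycle at which the source operand becomes ready


def split_gather(addresses, element_bits):
    """Step 1: one ordinary load micro-op per element, count set by element size."""
    count = VECTOR_BITS // element_bits
    if len(addresses) != count:
        raise ValueError(f"expected {count} element addresses")
    return [LoadUop(i, addr) for i, addr in enumerate(addresses)]


def run_gather(uops, memory, fail_once=frozenset()):
    """Steps 2-5: queue the micro-ops, issue ready ones, replay failed ones."""
    queue = list(uops)        # step 2: micro-ops enter the queue
    to_fail = set(fail_once)  # elements whose first access fails (simulated)
    results, issue_log, cycle = {}, [], 0
    while queue:
        ready = [u for u in queue if u.ready_cycle <= cycle]
        if ready:
            uop = ready[0]                  # step 3: issue the oldest ready micro-op
            issue_log.append(uop.element)
            queue.remove(uop)
            if uop.element in to_fail:      # step 5: access failed this time
                to_fail.discard(uop.element)
                queue.append(uop)           # keep it queued so it is re-issued later
            else:                           # step 4: single-element load succeeds
                results[uop.element] = memory[uop.address]
        cycle += 1
    return results, issue_log


memory = {addr: addr * 10 for addr in (8, 16, 24, 40)}
uops = split_gather([40, 8, 24, 16], element_bits=64)
results, issue_log = run_gather(uops, memory, fail_once={2})
print(results)    # element number -> loaded value
print(issue_log)  # element 2 appears twice because it was replayed
```

Because every micro-op carries its element number, a replayed element can complete out of order without disturbing the others.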

Abstract

The invention relates to the technical field of microprocessor design, and in particular to a method for implementing a vector aggregation load instruction. The method comprises the following steps: splitting the vector aggregation load instruction into a plurality of single-element ordinary load micro-operations; sending each split micro-operation and its element number to an instruction queue; once the operands are ready, sending the single-element load micro-operation to the storage pipeline to obtain the data; writing the obtained data into the corresponding element of the corresponding data cache entry; and, after all element data of the data cache entry has been written, writing the result data from the data cache to a result bus, which completes the execution of the vector aggregation load instruction. The method improves the execution performance of the vector aggregation load instruction and reuses the path of the ordinary load instruction as far as possible. It is suited to high-performance out-of-order superscalar microprocessors and is simple to implement.
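The last two steps of the abstract, collecting per-element data in a data cache entry and writing to the result bus once every element has arrived, can be sketched as follows. The class name `GatherCacheEntry`, its fields, and the list used as a stand-in for the result bus are hypothetical; the source does not describe the entry layout.

```python
class GatherCacheEntry:
    """Toy model of one data cache entry that assembles a gathered vector."""

    def __init__(self, num_elements):
        self.data = [None] * num_elements
        self.valid = [False] * num_elements  # one valid bit per element (assumed)

    def write(self, element, value):
        """Store one element; return True when the entry has just become complete."""
        was_complete = all(self.valid)
        self.data[element] = value
        self.valid[element] = True
        return not was_complete and all(self.valid)


def collect(entry, returns, result_bus):
    """Write returned (element, value) pairs; emit the vector once it is complete."""
    for element, value in returns:
        if entry.write(element, value):
            result_bus.append(tuple(entry.data))


entry = GatherCacheEntry(4)
result_bus = []
# Elements may return from the storage pipeline in any order.
collect(entry, [(2, 30), (0, 10), (3, 40), (1, 20)], result_bus)
print(result_bus)
```

Indexing the write by element number is what lets the single-element loads return in any order while the assembled vector still has its elements in the right positions.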

Description

Technical field

[0001] The invention relates to the technical field of microprocessor design, and in particular to a method for implementing a vector aggregation load instruction.

Background

[0002] To keep pace with application programs and improve execution efficiency, mainstream instruction sets have added a variety of vector extensions. Exploiting the parallelism of vector operations can improve system performance. Vector extensions include a class of vector aggregate load instructions (Gather Load, denoted GLoad) that differ substantially from ordinary load instructions. As shown in Figure 1(a), an ordinary load instruction loads data from contiguous addresses in the storage space into a register. For a vector aggregate load instruction, as shown in Figure 1(b), each element of the vector has a different address. This instruction needs to fetch an element from multiple i...
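The difference between the two kinds of load described in the background can be shown in a few lines. This is a purely illustrative sketch: the function names and the toy memory, in which one list slot stands for one element-sized location, are assumptions and not part of the source.

```python
def contiguous_load(memory, base, count):
    """Ordinary vector load: elements come from consecutive addresses."""
    return [memory[base + i] for i in range(count)]


def gather_load(memory, addresses):
    """Gather load (GLoad): every element has its own, independent address."""
    return [memory[addr] for addr in addresses]


memory = list(range(100, 132))  # toy memory: address i holds the value 100 + i
print(contiguous_load(memory, base=4, count=4))
print(gather_load(memory, [7, 0, 19, 3]))
```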

Application Information

IPC(8): G06F9/30, G06F9/38
CPC: G06F9/30094, G06F9/3867, G06F9/30007
Inventor: 郑重, 王永文, 孙彩霞, 王俊辉, 隋兵才, 倪晓强, 雷国庆, 黄立波, 郭维, 郭辉
Owner NAT UNIV OF DEFENSE TECH