
Out-of-order execution microprocessor with reduced store collision load replay reduction

A microprocessor and store-handling technology, applied in the field of out-of-order execution microprocessors, that addresses the problems of load instructions receiving incorrect data, high replay cost, and a large penalty for processing load instructions, so as to reduce the likelihood of having to replay a load instruction due to a store collision.

Inactive Publication Date: 2010-12-02
VIA TECH INC
28 Cites, 50 Cited by

AI Technical Summary

Benefits of technology

[0011] In one aspect, the present invention provides an out-of-order execution microprocessor for reducing the likelihood of having to replay a load instruction due to a store collision. The microprocessor includes a queue of entries, each entry configured to hold an instruction pointer of a load instruction and to hold information useable to identify a store instruction that caused the load instruction to be replayed on a first instance of the load instruction. The microprocessor also includes a register alias table (RAT), coupled to the queue of entries, that is configured to encounter instructions in program order and to generate dependencies used to determine when the instructions may execute out of program order. The RAT is configured to encounter the load instruction on a second instance, to determine that the load instruction second instance instruction pointer matches the instruction pointer of an entry of the queue, and to cause the load instruction on the second instance to have a dependency on the store instruction identified by the information in the matching entry.
[0012] In another aspect, the present invention provides a method for reducing the likelihood of having to replay a load instruction in an out-of-order execution microprocessor due to a store collision, the microprocessor having a register alias table (RAT) configured to encounter instructions in program order and to generate dependencies used to determine when the instructions may execute out of program order. The method includes allocating an entry of a queue of entries, in response to replay of a load instruction on a first instance. The method also includes populating the allocated entry to hold an instruction pointer of the load instruction and to hold information useable to identify a store instruction that caused the load instruction to be replayed on the first instance. The method also includes determining, in response to the RAT encountering the load instruction on a second instance, that the load instruction second instance instruction pointer matches the instruction pointer of an entry of the queue. The method also includes causing the load instruction on the second instance to have a dependency on the store instruction identified by the information in the matching entry.
[0013] In another aspect, the present invention provides a computer program product for use with a computing device, the computer program product comprising a computer usable storage medium, having computer readable program code embodied in the medium, for specifying an out-of-order execution microprocessor for reducing the likelihood of having to replay a load instruction due to a store collision. The computer readable program code includes first program code for specifying a queue of entries, each entry configured to hold an instruction pointer of a load instruction and to hold information useable to identify a store instruction that caused the load instruction to be replayed on a first instance of the load instruction. The computer readable program code also includes second program code for specifying a register alias table (RAT), coupled to the queue of entries. The RAT is configured to encounter instructions in program order and to generate dependencies used to determine when the instructions may execute out of program order. The RAT is also configured to encounter the load instruction on a second instance, to determine that the load instruction second instance instruction pointer matches the instruction pointer of an entry of the queue, and to cause the load instruction on the second instance to have a dependency on the store instruction identified by the information in the matching entry.
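
The queue-and-RAT mechanism described in the aspects above can be illustrated with a small software model. This is a minimal sketch under stated assumptions, not the patented hardware: the names `ReplayHistoryQueue`, `RAT`, `Instr`, and `stores_back` are hypothetical, and the "information useable to identify a store instruction" is assumed here to be a count of how many stores back in program order the colliding store was. The queue capacity and oldest-first eviction are likewise assumptions.

```python
from collections import OrderedDict
from dataclasses import dataclass, field


@dataclass
class Instr:
    ip: int                                   # instruction pointer
    kind: str                                 # "load", "store", or "alu"
    rob_index: int = -1                       # assigned by the RAT in program order
    deps: set = field(default_factory=set)    # ROB indexes this instruction waits on


class ReplayHistoryQueue:
    """Small queue of (load IP -> store identifier) entries.

    ASSUMPTION: the store identifier is `stores_back`, the number of stores
    back from the load in program order (1 = the nearest older store).
    """

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = OrderedDict()          # load_ip -> stores_back

    def allocate(self, load_ip, stores_back):
        """Allocate and populate an entry when a load is replayed."""
        if load_ip in self.entries:
            self.entries.pop(load_ip)
        elif len(self.entries) == self.capacity:
            self.entries.popitem(last=False)  # ASSUMPTION: evict the oldest entry
        self.entries[load_ip] = stores_back

    def lookup(self, load_ip):
        return self.entries.get(load_ip)


class RAT:
    """Encounters instructions in program order and generates dependencies."""

    def __init__(self, queue):
        self.queue = queue
        self.next_rob = 0
        self.recent_stores = []               # ROB indexes of stores, oldest first

    def encounter(self, instr):
        instr.rob_index = self.next_rob
        self.next_rob += 1
        if instr.kind == "store":
            self.recent_stores.append(instr.rob_index)
        elif instr.kind == "load":
            stores_back = self.queue.lookup(instr.ip)
            # On an IP match, make the load depend on the identified store.
            if stores_back is not None and stores_back <= len(self.recent_stores):
                instr.deps.add(self.recent_stores[-stores_back])
        return instr


queue = ReplayHistoryQueue(capacity=2)
rat = RAT(queue)

# First instance: the load gets no extra dependency and is later replayed.
rat.encounter(Instr(ip=0x10, kind="store"))
first_load = rat.encounter(Instr(ip=0x20, kind="load"))
queue.allocate(first_load.ip, stores_back=1)  # record the replay

# Second instance (e.g. next loop iteration): the IP matches a queue entry.
second_store = rat.encounter(Instr(ip=0x10, kind="store"))
second_load = rat.encounter(Instr(ip=0x20, kind="load"))
print(second_load.deps == {second_store.rob_index})
```

In this model the second instance of the load carries a dependency on the second instance of the store, so an issue stage that honors `deps` would hold the load back until that store has executed.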

Problems solved by technology

This can be problematic in the context of a store collision because the load instruction may be issued for execution before the older store instruction, thereby causing the load instruction to receive incorrect data.
However, replays can be relatively expensive, particularly in microprocessors that are deeply pipelined.
First, the store instruction may be dependent on other instructions—indeed, the store instruction may be at the end of a long chain of dependencies—such that it may not execute for potentially many clock cycles; thus, the load instruction must wait potentially many clock cycles before it can be replayed.
The larger the number of clock cycles that the load instruction must wait to be replayed, the larger the penalty to process the load instruction.
The larger the number of pipeline stages that the load instruction must pass back through, the larger the penalty in terms of number of clock cycles to process the load instruction.
Thus, a potential disadvantage of the color bits array is that it may require a significant amount of storage space on the microprocessor since the number of entries of the instruction cache is typically relatively large.
A relatively large color bits array may consume significant amounts of power and real estate space of the microprocessor.
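
The store-collision hazard described at the start of this section (a load issued before an older store to the same address) can be sketched as a toy model. Everything here is an assumption made for illustration: the `Op` record, the `execute` function, and the detection rule (a store checks for younger loads to the same address that already executed) are hypothetical and are not taken from the patent text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    seq: int        # position in program order (lower = older)
    kind: str       # "store" or "load"
    addr: int
    value: int = 0  # data written (stores only)


def execute(issue_order, memory):
    """Run ops in issue order; return (values read by loads, loads to replay)."""
    memory = dict(memory)   # work on a copy
    loaded = {}             # load seq -> (addr, value read)
    replays = []
    for op in issue_order:
        if op.kind == "load":
            loaded[op.seq] = (op.addr, memory[op.addr])
        else:
            memory[op.addr] = op.value
            # ASSUMED detection rule: a younger load to the same address
            # that already executed received stale data and must be replayed.
            for seq, (addr, _) in loaded.items():
                if seq > op.seq and addr == op.addr:
                    replays.append(seq)
    return loaded, replays


store = Op(seq=0, kind="store", addr=0x100, value=42)
load = Op(seq=1, kind="load", addr=0x100)
initial = {0x100: 7}

# Load issued ahead of the older store: reads the stale 7 and is replayed.
print(execute([load, store], initial))
# Store issued first: the load reads 42 and no replay is needed.
print(execute([store, load], initial))
```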

Embodiment Construction

[0025] Described herein are embodiments of a pipelined out-of-order execution microprocessor that reduces the number of load instruction replays in the presence of store collisions. The microprocessor includes an enhanced register alias table (RAT) that predicts when a load instruction is involved in a store collision and causes the load instruction to be dependent upon an additional instruction that the load instruction would not normally be dependent upon. The additional instruction upon which the RAT makes the load instruction dependent is referred to herein as the dependee instruction. The additional, or enhanced, dependency causes the issue logic of the microprocessor to wait to issue the load instruction until the dependee instruction has executed, i.e., has produced its result, so that the dependee instruction result can be forwarded to the load instruction or read from the data cache. Consequently, when the issue logic does issue the load instruction for execution, the load i...
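
The effect of the enhanced dependency on issue timing can be sketched with a toy cycle-by-cycle scheduler. This is an illustrative model only: the `schedule` function, the instruction names, the latencies, and the unlimited issue width are all assumptions, and real issue logic is considerably more involved.

```python
def schedule(instrs):
    """Issue each instruction in the first cycle in which all its deps are done.

    instrs maps name -> (list of dep names, latency in cycles, at least 1).
    Every dep must itself be a key of instrs. ASSUMPTION: unlimited issue width.
    Returns (issue_cycle, done_cycle) dictionaries keyed by name.
    """
    issue, done = {}, {}
    pending = dict(instrs)
    cycle = 0
    while pending:
        for name, (deps, latency) in list(pending.items()):
            if all(d in done and done[d] <= cycle for d in deps):
                issue[name] = cycle
                done[name] = cycle + latency
                del pending[name]
        cycle += 1
    return issue, done


# Without the enhanced dependency: the load issues at cycle 0, ahead of the
# older store, which is still waiting on a 3-cycle multiply.
plain = {"mul": ([], 3), "store": (["mul"], 1), "load": ([], 1)}
issue_plain, _ = schedule(plain)

# With the enhanced dependency: the store is the dependee, so the load
# waits until the store has executed.
enhanced = {"mul": ([], 3), "store": (["mul"], 1), "load": (["store"], 1)}
issue_enh, done_enh = schedule(enhanced)

print(issue_plain["load"], issue_plain["store"])  # 0 3
print(issue_enh["load"], done_enh["store"])       # 4 4
```

In the first program the load issues three cycles before the store, which is the situation that leads to a replay if the two collide; in the second the load issues only once the store's result is available.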

Abstract

An out-of-order execution microprocessor for reducing the likelihood of having to replay a load instruction due to a store collision. The microprocessor includes a queue of entries, each entry configured to hold an instruction pointer of a load instruction and to hold information useable to identify a store instruction that caused the load instruction to be replayed on a first instance of the load instruction. A register alias table (RAT) encounters instructions in program order and generates dependencies used to determine when the instructions may execute out of program order. The RAT encounters the load instruction on a second instance, determines that the load instruction second instance instruction pointer matches the instruction pointer of an entry of the queue, and causes the load instruction on the second instance to have a dependency on the store instruction identified by the information in the matching entry.

Description

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority based on U.S. Provisional Application Ser. No. 61/182,283, filed May 29, 2009, entitled OUT-OF-ORDER EXECUTION MICROPROCESSOR WITH REDUCED STORE COLLISION LOAD REPLAY REDUCTION, which is hereby incorporated by reference in its entirety.

[0002] This application is related to the following co-pending U.S. patent applications which are concurrently filed herewith, and which have a common assignee and common inventors, each of which is incorporated by reference herein for all purposes.

Serial Number    Title
(CNTR.2354)      OUT-OF-ORDER EXECUTION MICROPROCESSOR WITH REDUCED STORE COLLISION LOAD REPLAY REDUCTION
(CNTR.2486)      OUT-OF-ORDER EXECUTION MICROPROCESSOR WITH REDUCED STORE COLLISION LOAD REPLAY REDUCTION

FIELD OF THE INVENTION

[0003] The present invention relates in general to out-of-order execution microprocessors, and more particularly to the performance of memory load instructions therein.

BACKGROUND OF THE INVENTION

[0004] M...

Application Information

IPC(8): G06F9/30
CPC: G06F9/3834, G06F9/3838, G06F9/3861
Inventor: DAY, MATTHEW DANIEL; HOOKER, RODNEY E.
Owner: VIA TECH INC