
Branch prediction accuracy in a processor that supports speculative execution

A speculative-execution and branch-prediction technology, applied in the fields of instruments, digital computers, computation using denominational number representation, etc. It addresses the problem that microprocessor systems spend a large fraction of their time waiting for memory references to complete, and improves branch prediction accuracy.

Status: Inactive
Publication Date: 2006-07-27
SUN MICROSYSTEMS INC
0 Cites, 26 Cited by

AI Technical Summary

Benefits of technology

[0015] One embodiment of the present invention provides a system which improves branch prediction accuracy in a processor that supports speculative-execution. During normal-execution mode, the system issues instructions in program order. Upon encountering a launch condition which causes a processor to enter a speculative-execution mode, the system performs a checkpoint and begins executing instructions in a speculative-execution mode. Upon encountering a branch instruction during speculative-execution mode, the system selects the subsequent instruction to be executed based on a current state of a branch predictor and updates the branch prediction only from weakly-not-taken to weakly-taken or from weakly-taken to weakly-not-taken during speculative-execution mode. Note that updating the branch predictor in this fashion prevents the branch predictor from being incorrectly updated twice when re-executing the branch instruction after returning to normal-execution mode.
[0023] One embodiment of the present invention provides a system which improves branch prediction accuracy in a processor that supports speculative-execution. During normal-execution mode, the system issues instructions in program order. Upon encountering a launch condition which causes a processor to enter a speculative-execution mode, the system performs a checkpoint and begins executing instructions in a speculative-execution mode. Upon encountering a branch instruction during speculative-execution mode, the system selects the subsequent instruction to be executed based on a current state of a branch predictor and leaves the branch predictor in the current state. Note that leaving the branch predictor in the current state prevents the branch predictor from being incorrectly updated twice when re-executing the branch instruction after returning to normal-execution mode.
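The two speculative-mode policies in [0015] and [0023] can be sketched as small update functions. This is only an illustrative reading of the text, not the patented implementation: the function names, the integer values used for weakly-taken and strongly-taken, and the saturating normal-mode update rule are all assumptions.

```python
SNT, WNT, WT, ST = 0, 1, 2, 3  # WT/ST values are an assumed encoding


def normal_update(state: int, taken: bool) -> int:
    """Assumed normal-mode rule: saturating step toward the outcome."""
    return min(state + 1, ST) if taken else max(state - 1, SNT)


def speculative_update_weak_only(state: int, taken: bool) -> int:
    """[0015]-style policy: only WNT -> WT or WT -> WNT is allowed."""
    if state == WNT and taken:
        return WT
    if state == WT and not taken:
        return WNT
    return state


def speculative_update_frozen(state: int, taken: bool) -> int:
    """[0023]-style policy: leave the predictor in its current state."""
    return state


def run_branch_twice(state, taken, speculative_update):
    """Resolve a branch speculatively, then re-execute it in normal mode."""
    state = speculative_update(state, taken)
    return normal_update(state, taken)
```

Under these assumptions, a strongly-taken branch that resolves not-taken ends in weakly-taken under either policy, for example `run_branch_twice(ST, False, speculative_update_frozen)`, which matches a single normal-mode update.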

Problems solved by technology

Hence, the disparity between microprocessor clock speeds and memory access speeds continues to grow, and is beginning to create significant performance problems.
This means that the microprocessor systems spend a large fraction of time waiting for memory references to complete instead of performing computational operations.
When a memory reference, such as a load operation, generates a cache miss, the subsequent access to level-two (L2) cache (or memory) can require dozens or hundreds of clock cycles to complete, during which time the processor is typically idle, performing no useful work.
Unfortunately, existing out-of-order designs have a hardware complexity that grows quadratically with the size of the issue queue.
Practically speaking, this constraint limits the number of entries in the issue queue to one or two hundred, which is not sufficient to hide memory latencies as processors continue to get faster.
Moreover, constraints on the number of physical registers that can be used for register renaming purposes during out-of-order execution also limit the effective size of the issue queue.
Unfortunately, certain operations, such as branch instructions, can be adversely affected by speculative-execution.
During speculative-execution, a problem can arise when the processor updates the branch prediction mechanism once during speculative-execution and then incorrectly updates the branch prediction a second time upon resuming normal-execution.
This duplication of updates to the branch prediction can cause the processor to subsequently mispredict the branch, thereby causing considerable performance degradation.
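A small simulation can make the double-update problem concrete. It is a sketch that assumes a conventional 2-bit saturating counter (the integer values for the taken states and the helper names are assumptions, not taken from the patent): the same not-taken outcome is applied once during speculative-execution and again on re-execution.

```python
SNT, WNT, WT, ST = 0, 1, 2, 3  # 2-bit counter values; WT/ST encoding is assumed


def update(state: int, taken: bool) -> int:
    """Saturating step toward the observed branch outcome."""
    return min(state + 1, ST) if taken else max(state - 1, SNT)


def predicts_taken(state: int) -> bool:
    """Predict taken for the two taken states."""
    return state >= WT


# A branch that is almost always taken sits in strongly-taken.
# It resolves not-taken once: first speculatively, then again on re-execution.
single = update(ST, taken=False)      # updated once: ST -> WT
double = update(single, taken=False)  # updated again on re-execution: WT -> WNT

print(predicts_taken(single))  # True: one update still predicts taken
print(predicts_taken(double))  # False: the duplicate update flips the prediction
```

With a single update the predictor still predicts taken for the next occurrence; with the duplicated update it predicts not-taken, so a branch that returns to its usual taken behavior would be mispredicted.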



Examples


Embodiment Construction

[0035] The following description is presented to enable any person skilled in the art to make and use the invention, and is provided in the context of a particular application and its requirements. Various modifications to the disclosed embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the present invention. Thus, the present invention is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.

Branch Predictor

[0036] In one embodiment of the present invention, the branch prediction associated with each branch is recorded in a 2-bit field. Consequently, there are 4 states possible for the branch predictor. The four states (along with their associated bit patterns) are:

Strongly-not-taken (SNT 600) {00}
Weakly-not-taken (WNT 601) {01}
...
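The 2-bit field described above can be modeled as a small state type. The sketch below takes the bit patterns for strongly-not-taken and weakly-not-taken from the text; the patterns for the two taken states are truncated in this excerpt, so the values 10 and 11 here are an assumption based on the conventional saturating-counter encoding, as are the helper names and the saturating update rule.

```python
from enum import IntEnum


class State(IntEnum):
    # SNT and WNT bit patterns come from the text; WT and ST are ASSUMED
    # to follow the conventional saturating-counter encoding.
    SNT = 0b00  # strongly-not-taken
    WNT = 0b01  # weakly-not-taken
    WT = 0b10   # weakly-taken (assumed encoding)
    ST = 0b11   # strongly-taken (assumed encoding)


def predict_taken(state: State) -> bool:
    """Predict taken when the state is one of the two taken states."""
    return state >= State.WT


def update(state: State, taken: bool) -> State:
    """Move one step toward the observed outcome, saturating at the ends."""
    if taken:
        return State(min(state + 1, State.ST))
    return State(max(state - 1, State.SNT))
```

For example, `update(State.WNT, True)` yields `State.WT`, after which `predict_taken` returns `True`.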


Abstract

One embodiment of the present invention provides a system which improves branch prediction accuracy in a processor that supports speculative-execution. During normal-execution mode, the system issues instructions in program order. Upon encountering a launch condition which causes a processor to enter a speculative-execution mode, the system performs a checkpoint and begins executing instructions in a speculative-execution mode. Upon encountering a branch instruction during speculative-execution mode, the system selects the subsequent instruction to be executed based on a current state of a branch predictor and does not update the current state of the branch predictor, thereby preventing the branch predictor from being incorrectly updated twice when re-executing the branch instruction after returning to normal-execution mode.

Description

BACKGROUND

[0001] 1. Field of the Invention

[0002] The present invention relates to techniques for improving the performance of computer systems. More specifically, the present invention relates to a method and apparatus for improving branch prediction accuracy in a processor that supports speculative execution.

[0003] 2. Related Art

[0004] Advances in semiconductor fabrication technology have given rise to dramatic increases in microprocessor clock speeds. This increase in microprocessor clock speeds has not been matched by a corresponding increase in memory access speeds. Hence, the disparity between microprocessor clock speeds and memory access speeds continues to grow, and is beginning to create significant performance problems. Execution profiles for fast microprocessor systems show that a large fraction of execution time is spent not within the microprocessor core, but within memory structures outside of the microprocessor core. This means that the microprocessor systems spend...


Application Information

Patent Type & Authority: Applications (United States)
IPC(8): G06F9/00
CPC: G06F9/3842, G06F9/3844
Inventor: CAPRIOLI, PAUL; YIP, SHERMAN H.; CHAUDHRY, SHAILENDER
Owner: SUN MICROSYSTEMS INC