Fetch director employing barrel-incrementer-based round-robin apparatus for use in multithreading microprocessor

Multithreading microprocessor and barrel-incrementer technology, applied in the fields of computation using denominational number representation, multi-programming arrangements, instruments, etc., which can address problems that degrade the performance of a multithreading microprocessor.

Active Publication Date: 2009-02-10
ARM FINANCE OVERSEAS LTD
Cites: 91 | Cited by: 34

AI Technical Summary

Benefits of technology

"The present invention provides an apparatus and method for selecting one of N fetch addresses associated with N corresponding threads for providing to an instruction cache for fetching instructions therefrom in a multithreading microprocessor that concurrently executes the N threads. The apparatus includes a first input for receiving a first corresponding N-bit value specifying which of the N threads was last selected to fetch instructions, a second input for receiving a second corresponding N-bit value, and a barrel incrementer and combinational logic for adding the second value to a 1-bit left-rotated version of the first value to generate a sum and a carry-out bit. The method includes receiving the first and second inputs and adding the second value to a 1-bit left-rotated version of the first value to generate a third corresponding N-bit value. The technical effect of the invention is that it provides a more efficient and scalable way to select which threads should be fetched in a multithreading microprocessor, based on the number of threads requesting to fetch instructions."

Problems solved by technology

However, the performance improvement that may be achieved through exploitation of instruction-level parallelism is limited.
One example of a performance-constraining issue addressed by multithreading microprocessors is the fact that accesses to memory outside the microprocessor that must be performed due to a cache miss typically have a relatively long latency.
Consequently, some or all of the pipeline stages of a single-threaded microprocessor may be idle, performing no useful work for many clock cycles.
Other examples of performance-constraining issues addressed by multithreading microprocessors are pipeline stalls and their accompanying idle cycles due to a data dependence; or due to a long latency instruction such as a divide instruction, floating-point instruction, or the like; or due to a limited hardware resource conflict.
However, the circuitry to implement a round-robin arbitration scheme in which only a variable subset of the requestors may be requesting the resource each time the resource becomes available is more complex.
Thus, as the number of requesters—such as the number of threads in a multithreading microprocessor—becomes relatively large, the size of the conventional circuit may become burdensome on the processor in terms of size and power consumption, particularly if more than one such circuit is needed in the processor.



Embodiment Construction

[0046]Referring now to FIG. 1, a block diagram illustrating a pipelined multithreading microprocessor 100 according to the present invention is shown. The microprocessor 100 is configured to concurrently execute a plurality of threads. A thread—also referred to herein as a thread of execution, or instruction stream—comprises a sequence, or stream, of program instructions. The threads may be from different programs executing on the microprocessor 100, or may be instruction streams from different parts of the same program executing on the microprocessor 100, or a combination thereof.

[0047]Each thread has an associated thread context (TC). A thread context comprises a collection of storage elements, such as registers or latches, and/or bits in the storage elements of the microprocessor 100 that describe the state of execution of a thread. That is, the thread context describes the state of its respective thread, which is unique to the thread, rather than state shared with other threads ...


Abstract

A fetch director in a multithreaded microprocessor that concurrently executes instructions of N threads is disclosed. The N threads request to fetch instructions from an instruction cache. In a given selection cycle, some of the threads may not be requesting to fetch instructions. The fetch director includes a circuit for selecting one of the threads in a round-robin fashion to provide its fetch address to the instruction cache. The circuit adds a first addend to a 1-bit left-rotated version of a second addend to generate a sum and a carry-out bit. The circuit includes the carry-out bit as a carry-in bit of the add to generate the sum. The sum is ANDed with the inverse of the first addend to generate a 1-hot vector indicating which of the threads is selected next. The first addend is an N-bit vector where each bit is false if the corresponding thread is requesting to fetch instructions from the instruction cache. The second addend is a 1-hot vector indicating the last selected thread. In one embodiment, threads with an empty instruction buffer are selected at highest priority; a last dispatched but not fetched thread at middle priority; all other threads at lowest priority. The threads are selected round-robin within the highest and lowest priorities.
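
The selection arithmetic described in the abstract can be modelled in software. The following is a minimal Python sketch, not the patented circuit: the function names (`rotl1`, `select_next`, `select_with_priority`), the use of plain integers as N-bit vectors, and the two-step emulation of the carry-out feedback are all assumptions made for illustration. The three-level priority function additionally assumes that one "last selected" vector is shared across priority levels, which the abstract does not specify.

```python
def rotl1(x: int, n: int) -> int:
    """Rotate an n-bit value left by one bit position."""
    mask = (1 << n) - 1
    return ((x << 1) | (x >> (n - 1))) & mask


def select_next(requesting: int, last: int, n: int) -> int:
    """Return a 1-hot vector naming the next thread, or 0 if none requests.

    requesting: bit i is 1 when thread i wants to fetch (hypothetical encoding).
    last:       1-hot vector marking the thread selected last time.
    """
    mask = (1 << n) - 1
    first_addend = ~requesting & mask   # bit is 0 (false) where a thread requests
    second_addend = rotl1(last, n)      # begin the search one past the last winner
    total = first_addend + second_addend
    carry_out = total >> n              # carry-out of the n-bit add
    total = (total + carry_out) & mask  # feed the carry-out back in as carry-in
    return total & requesting           # AND with the inverse of the first addend


def select_with_priority(empty: int, middle: int, rest: int, last: int, n: int) -> int:
    """Assumed three-level pick; each mask already contains only requesting threads.

    empty:  threads whose instruction buffer is empty (highest priority).
    middle: 1-hot vector for the last dispatched but not fetched thread, or 0.
    rest:   all other requesting threads (lowest priority).
    """
    if empty:
        return select_next(empty, last, n)  # round-robin within highest priority
    if middle:
        return middle                       # single candidate at middle priority
    return select_next(rest, last, n)       # round-robin within lowest priority


if __name__ == "__main__":
    # Threads 1 and 3 request; thread 1 was selected last, so thread 3 is next.
    print(format(select_next(0b1010, 0b0010, 4), "04b"))
    # Thread 3 was selected last, so the search wraps around to thread 1.
    print(format(select_next(0b1010, 0b1000, 4), "04b"))
```

The carry ripples through the 1 bits of the first addend (threads that are not requesting) and stops at the first 0 bit (a requesting thread), which is why the final AND leaves exactly one bit set whenever at least one thread is requesting.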

Description

CROSS REFERENCE TO RELATED APPLICATIONS

[0001]This application is related to the following co-pending Non-Provisional U.S. Patent Applications, which are hereby incorporated by reference in their entirety for all purposes:

[0002]Ser. No. (Docket No.) | Filing Date | Title
  • 11/051997 (MIPS.0199-00-US) | Feb. 4, 2005 | BIFURCATED THREAD SCHEDULER IN A MULTITHREADING MICROPROCESSOR
  • 11/051980 (MIPS.0200-00-US) | Feb. 4, 2005 | LEAKY-BUCKET THREAD SCHEDULER IN A MULTITHREADING MICROPROCESSOR
  • 11/051979 (MIPS.0201-00-US) | Feb. 4, 2005 | MULTITHREADING MICROPROCESSOR WITH OPTIMIZED THREAD SCHEDULER FOR INCREASING PIPELINE UTILIZATION EFFICIENCY
  • 11/051998 (MIPS.0201-01-US) | Feb. 4, 2005 | MULTITHREADING PROCESSOR INCLUDING THREAD SCHEDULER BASED ON INSTRUCTION STALL LIKELIHOOD PREDICTION
  • 11/051978 (MIPS.0202-00-US) | Feb. 4, 2005 | INSTRUCTION/SKID BUFFERS IN A MULTITHREADING MICROPROCESSOR

[0003]This application is related to and filed concurrently with the following Non-Provisional U.S. Patent Applications, each of which is incorporated by reference in i...


Application Information

Patent Type & Authority: Patents (United States)
IPC(8): G06F9/50
CPC: G06F9/3802, G06F9/3814, G06F9/3851, G06F9/3888
Inventors: JENSEN, MICHAEL GOTTLIEB; BANERJEE, SOUMYA
Owner: ARM FINANCE OVERSEAS LTD