
Technique for grouping instructions into independent strands

A technique for grouping instructions into independent strands, applied in the field of multithreaded programming. It addresses the inefficient use of limited hardware resources, which cannot perform useful work while a thread waits on long-latency instructions. The stated effects are fewer idle cycles, higher processing throughput for the processing core, and better energy efficiency of the processing core.

Active Publication Date: 2015-02-12
NVIDIA CORP
10 Cites, 11 Cited by

AI Technical Summary

Benefits of technology

According to the patent text, the technique makes a processor more efficient: the processing core spends fewer cycles idle, which raises its throughput and its energy efficiency relative to conventional multithreaded execution.

Problems solved by technology

In conventional multithreaded execution, where threads take turns using a shared hardware resource, problems arise when the software program involves long-latency instructions, such as load instructions or texture fetch operations.
A thread that issues such an instruction keeps the hardware resource while it waits, so during the time spent waiting for the instruction to complete, the hardware resource cannot perform any useful work.
In short, conventional multithreaded software programs fail to use limited hardware resources efficiently.
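
The cost of that waiting can be made concrete with a toy cycle-count model. The sketch below is an illustration only, not taken from the patent: the function names, the two-thread setup, and the cycle counts are all assumptions. It compares a schedule in which thread A holds the shared resource for the whole load latency with one in which the resource is handed to thread B while A's load is in flight.

```python
def conventional_schedule(compute_a, load_latency, compute_b):
    """Thread A keeps the resource while its load is outstanding; B runs afterwards.

    Returns (total_cycles, idle_cycles) for the shared resource.
    All quantities are hypothetical cycle counts.
    """
    total = compute_a + load_latency + compute_b
    idle = load_latency  # the resource does nothing while A waits
    return total, idle


def overlapped_schedule(compute_a, load_latency, compute_b):
    """The resource is re-allocated to thread B while A's load is in flight.

    Returns (total_cycles, idle_cycles) for the shared resource.
    """
    total = compute_a + max(load_latency, compute_b)
    idle = max(0, load_latency - compute_b)  # only the part B cannot cover
    return total, idle


if __name__ == "__main__":
    # Hypothetical numbers: 4 cycles of work in A, a 100-cycle load, 20 cycles of work in B.
    print(conventional_schedule(4, 100, 20))
    print(overlapped_schedule(4, 100, 20))
```

In this model the idle time shrinks by exactly the amount of independent work that the second thread can run during the load, which is why the benefit depends on finding instructions that do not depend on the long-latency result.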



Examples

Embodiment Construction

[0018] In the following description, numerous specific details are set forth to provide a more thorough understanding of the present invention. However, it will be apparent to one of skill in the art that the present invention may be practiced without one or more of these specific details.

System Overview

[0019] FIG. 1 is a block diagram illustrating a computer system 100 configured to implement one or more aspects of the present invention. Computer system 100 includes a central processing unit (CPU) 102 and a system memory 104 communicating via an interconnection path that may include a memory bridge 105. System memory 104 includes an image of an operating system 130, a driver 103, and a co-processor enabled application 134. Operating system 130 provides detailed instructions for managing and coordinating the operation of computer system 100. Driver 103 provides detailed instructions for managing and coordinating operation of parallel processing subsystem 112 and one or more parallel p...



Abstract

A device compiler and linker is configured to group instructions into different strands for execution by different threads based on the dependence of those instructions on other, long-latency instructions. A thread may execute a strand that includes long-latency instructions, and then hardware resources previously allocated for the execution of that thread may be de-allocated from the thread and re-allocated to another thread. The other thread may then execute another strand while the long-latency instructions are in flight. With this approach, the other thread is not required to wait for the long-latency instructions to complete before acquiring hardware resources and initiating execution of the other strand, thereby eliminating at least a portion of the time that the other thread would otherwise spend waiting.
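
The grouping step can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patent's actual compiler algorithm: the `Instr` record, the register names, and the splitting rule (start a new strand at the first instruction that reads or overwrites a register produced by a long-latency instruction in the current strand) are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: opcode, destination, sources, latency class."""
    op: str
    dst: str
    srcs: tuple = ()
    long_latency: bool = False


def group_into_strands(program):
    """Split a straight-line program into strands.

    Assumed rule: a strand ends just before the first instruction that touches
    the result of a long-latency instruction issued earlier in the same strand.
    """
    strands = []
    current = []
    pending = set()  # registers written by long-latency instructions in `current`
    for instr in program:
        touched = (*instr.srcs, instr.dst)
        if any(reg in pending for reg in touched):
            # This instruction must wait for an in-flight result: close the strand.
            strands.append(current)
            current = []
            pending = set()
        current.append(instr)
        if instr.long_latency:
            pending.add(instr.dst)
    if current:
        strands.append(current)
    return strands


# Hypothetical program: two loads, arithmetic on their results, then a texture fetch.
program = [
    Instr("load", "r0", ("addr0",), long_latency=True),
    Instr("load", "r1", ("addr1",), long_latency=True),
    Instr("add", "r2", ("r0", "r1")),
    Instr("tex", "r3", ("r2",), long_latency=True),
    Instr("mul", "r4", ("r2", "r2")),
    Instr("add", "r5", ("r3", "r4")),
]

if __name__ == "__main__":
    for index, strand in enumerate(group_into_strands(program)):
        print(index, [instr.op for instr in strand])
```

Under this rule each strand ends at a point where the executing thread would otherwise stall, so the boundary is where hardware resources could be released to another thread. Note that `mul` stays in the same strand as `tex` because it does not depend on the texture fetch result.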

Description

BACKGROUND OF THE INVENTION

[0001] 1. Field of the Invention

[0002] The present invention generally relates to multithreaded programming and, more specifically, to a technique for grouping instructions into independent strands.

[0003] 2. Description of the Related Art

[0004] In a multithreaded processing paradigm, a processing unit may execute multiple threads. Those threads may share a hardware resource in order to execute different portions of a multithreaded software program. For example, a first thread could execute using the hardware resource to implement a first portion of the multithreaded program while a second thread waits for access to the hardware resource. When the first thread completes execution, the second thread could then execute a second portion of the multithreaded program using the hardware resource. The hardware resource could be, for example, an execution unit, an arithmetic logic unit, a processing core, or any such hardware resource.

[0005] Problems arise with the approach...


Application Information

Patent Type & Authority: Applications (United States)
IPC: IPC(8): G06F9/30
CPC: G06F9/30145, G06F9/3851, G06F8/45, G06F9/3888, G06F8/41, G06F8/433
Inventors: MEHRARA, MOJTABA; GARLAND, MICHAEL; DIAMOS, GREGORY
Owner: NVIDIA CORP