
Method for accelerating operating speed of graphics processing unit (GPU) through dead code removal

A dead-code removal technique for improving GPU running speed, applied in program control design, instrumentation, electrical digital data processing, etc. It addresses the low execution efficiency of GPU kernel programs, shortens compilation and optimization time, and reduces code size, including the size of the generated assembly code.

Status: Inactive
Publication Date: 2013-04-17
Owner: NAT UNIV OF DEFENSE TECH
Cites: 2; Cited by: 7

AI Technical Summary

Problems solved by technology

[0011] The technical problem to be solved by the present invention is the low execution efficiency of large-scale GPU kernel programs. On the premise of ensuring program correctness, the invention proposes a method for accelerating GPU running speed by removing dead code, which improves the execution and compilation efficiency of large-scale GPU kernel programs.



Examples


Embodiment Construction

[0044] Figure 1 shows the structure of the dead-code state detection table. The table is created as follows:

[0045] The number of entries in the state detection table equals the number of functions in the GPU kernel program. The table contains six fields: function number ID, function name Name, call mark Callee, static analysis mark Static, dynamic execution mark Dynamic, and delete mark Del. ID is the globally unique identifier of the function, and Name is the function's name. Callee indicates whether the function is called by the program: true means the function is called by the program, and false means it is not. Static records whether static analysis of the function module judges that the function will be executed: Static is true means that the function may run whe...
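As an illustration only, the six-field table described above could be represented as follows. This is a minimal sketch, not the patent's implementation: the Python names (`StateEntry`, `build_state_table`), the use of consecutive integers as IDs, and the choice to initialize every mark to false are assumptions. The source text is truncated before it defines Dynamic and Del, so their comments below follow only the field names.

```python
from dataclasses import dataclass


@dataclass
class StateEntry:
    """One row of the state detection table (one row per function)."""
    func_id: int            # ID: globally unique function number
    name: str               # Name: function name
    callee: bool = False    # Callee: function is called by the program
    static: bool = False    # Static: static analysis says it may run
    dynamic: bool = False   # Dynamic: meaning assumed from the field name
    delete: bool = False    # Del: meaning assumed from the field name


def build_state_table(function_names):
    """Create one entry per function; all marks start as False (assumed)."""
    return [StateEntry(func_id=i, name=n) for i, n in enumerate(function_names)]


# Hypothetical function names, used only to show the table shape.
table = build_state_table(["kernel_main", "transport", "tally"])
```

The table has exactly as many entries as there are functions, which matches the sizing rule stated in the paragraph above.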


Abstract

The invention discloses a method for accelerating the running speed of a GPU through dead code removal. The method improves the execution and compilation efficiency of large-scale GPU kernel programs. The technical scheme is as follows: first, a state detection table is established for all functions in the large-scale GPU kernel program, basic information about each function is recorded, and the table is initialized; static analysis is then performed on the GPU program; next, the GPU kernel program is run, runtime information is recorded, and the fields of every function's entry in the state detection table are updated; dead code is marked and finally confirmed; and the dead code is deleted according to the resulting dead code set D. Because the method removes dead code that is not executed at run time, the code size of the GPU kernel program is reduced, the size of the generated assembly code is reduced as well, the hit rate of single instruction multiple data (SIMD) instruction scheduling in the GPU can be improved, and the running efficiency of large-scale GPU kernel programs can be greatly improved.
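The sequence of steps in the abstract can be sketched roughly as below. This is a hedged illustration under stated assumptions, not the patented procedure: the call-graph dictionary, the `executed` set standing in for recorded runtime information, and all function names are hypothetical. In particular, the rule used here to place a function in D (it was not observed executing in the profiled run) is an assumption, and the confirmation step that the abstract mentions is not detailed in this excerpt, so it is not modeled.

```python
def reachable(call_graph, entry):
    """Stand-in for static analysis: functions reachable from the entry."""
    seen, stack = set(), [entry]
    while stack:
        f = stack.pop()
        if f in seen:
            continue
        seen.add(f)
        stack.extend(call_graph.get(f, ()))
    return seen


def dead_code_set(functions, call_graph, entry, executed):
    """Fill a state table and return it together with the dead code set D."""
    static_live = reachable(call_graph, entry)
    called = {c for callees in call_graph.values() for c in callees}
    table = {}
    for name in functions:
        row = {
            "Callee": name in called,        # some function calls it
            "Static": name in static_live,   # statically reachable from entry
            "Dynamic": name in executed,     # observed in the profiled run
        }
        # ASSUMPTION: never executed at run time => marked for deletion.
        row["Del"] = not row["Dynamic"]
        table[name] = row
    dead = {name for name, row in table.items() if row["Del"]}
    return table, dead


# Hypothetical program: debug_dump is reachable but never runs; unused is
# not reachable at all.
functions = ["kernel", "transport", "tally", "debug_dump", "unused"]
call_graph = {"kernel": ["transport", "debug_dump"], "transport": ["tally"]}
executed = {"kernel", "transport", "tally"}
table, D = dead_code_set(functions, call_graph, entry="kernel", executed=executed)
```

Under these assumptions the example shows why runtime information matters: `debug_dump` is statically reachable, so static analysis alone would keep it, yet it never runs for this input. A set D built this way is only valid for the inputs that were profiled, which is presumably why the source requires dead code to be confirmed before deletion.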

Description

Technical field

[0001] The invention relates to a method for accelerating the running speed of large-scale GPU kernel programs, in particular to a method for accelerating the running speed of the GPU by removing dead code.

Background

[0002] The GPU (Graphics Processing Unit) was formerly used mainly for graphics and image applications and is now widely used to accelerate various general-purpose parallel algorithms and applications. The kernel programs of these algorithms and applications on the GPU are usually relatively simple, often only a few hundred lines of code. However, for some large-scale applications of practical value, such as the non-deterministic particle transport program MCNP (Monte Carlo N-Particle), the kernel code implemented on the GPU usually runs to tens of thousands of lines, and a large amount of it is dead code when a specific program is executed. GPUs have s...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F9/45
Inventor: 迟利华, 刘杰, 胡庆丰, 晏益慧, 龚春叶, 甘新标, 徐涵, 蒋杰, 杨博
Owner: NAT UNIV OF DEFENSE TECH