CUDA multi-thread processing method, system and related equipment

A multi-thread processing method, applied in the computer field, that addresses the increased execution delay of kernel functions and the resulting loss of parallel-processing efficiency in CUDA, with the effects of saving hardware cost, improving efficiency, and reducing time overhead.

Active Publication Date: 2022-05-31
AZURENGINE TECH ZHUHAI INC

AI Technical Summary

Problems solved by technology

Before a kernel function executes its threads, the index of each thread usually has to be generated. On low-complexity hardware, generating all the indexes in a thread block takes many clock cycles, which increases the execution delay of the kernel function and reduces the efficiency of parallel processing in CUDA.


Examples

Detailed Description of the Embodiments

[0100] To help those skilled in the art better understand the solutions of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in the present application, without creative work, shall fall within the scope of protection of the present application.

[0101] The terms "comprising" and "having" and any variations thereof, as used in the specification, claims and drawings of this application, are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device comprising a series of steps or units is not limited to the listed step...

Abstract

The present application provides a CUDA multi-thread processing method, system and related equipment. The method includes: obtaining the configuration information corresponding to a kernel function; when no target historical configuration information matching the configuration information exists in the historical configuration information, generating the three-dimensional index of the thread according to the configuration information, compressing and packing the generated three-dimensional index according to the configuration information, and storing the compressed and packed three-dimensional index in a memory; and when the target historical configuration information exists in the historical configuration information, obtaining the historical three-dimensional index corresponding to the target historical configuration information, compressing and packing the historical three-dimensional index according to the target historical configuration information, and storing the compressed and packed historical three-dimensional index in a memory. The embodiments of the present application help improve the efficiency of multi-thread parallel processing in CUDA.
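The flow in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the configuration information is assumed to be just the thread-block dimensions, "compressing and packing" is assumed to mean bit-packing the three coordinates into one integer using the minimum bit width per axis, and the history store is assumed to be an in-memory dictionary. All names (`_history`, `generate_indices`, `pack_indices`, `unpack_index`, `prepare_indices`) are hypothetical.

```python
from typing import Dict, List, Tuple

Dim3 = Tuple[int, int, int]

# Hypothetical history store: block dimensions -> previously generated 3D indices.
_history: Dict[Dim3, List[Dim3]] = {}


def generate_indices(block_dim: Dim3) -> List[Dim3]:
    """Enumerate (x, y, z) for every thread in a block, with x varying fastest."""
    dx, dy, dz = block_dim
    return [(x, y, z) for z in range(dz) for y in range(dy) for x in range(dx)]


def bits_for(extent: int) -> int:
    """Bits needed to represent the values 0..extent-1 (at least one bit)."""
    return max(1, (extent - 1).bit_length())


def pack_indices(indices: List[Dim3], block_dim: Dim3) -> List[int]:
    """Assumed packing format: x in the low bits, then y, then z."""
    bx, by = bits_for(block_dim[0]), bits_for(block_dim[1])
    return [x | (y << bx) | (z << (bx + by)) for x, y, z in indices]


def unpack_index(word: int, block_dim: Dim3) -> Dim3:
    """Inverse of pack_indices for a single packed word."""
    bx, by = bits_for(block_dim[0]), bits_for(block_dim[1])
    x = word & ((1 << bx) - 1)
    y = (word >> bx) & ((1 << by) - 1)
    z = word >> (bx + by)
    return (x, y, z)


def prepare_indices(block_dim: Dim3) -> Tuple[List[int], bool]:
    """Return (packed indices, history_hit) for the given configuration."""
    hit = block_dim in _history
    if not hit:
        # No matching historical configuration: generate and remember the indices.
        _history[block_dim] = generate_indices(block_dim)
    # In both branches the 3D indices are packed according to the configuration.
    return pack_indices(_history[block_dim], block_dim), hit
```

Calling `prepare_indices((4, 2, 2))` twice generates the indices only on the first call; the second call reuses the stored indices and only repeats the packing step, which is the saving the abstract describes.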

Description

Technical field

[0001] The present application relates to the field of computer technology, and in particular to a CUDA multi-thread processing method, system and related equipment.

Background

[0002] CUDA (Compute Unified Device Architecture) is a computing platform launched by the graphics card manufacturer NVIDIA. It uses the C language as its programming language and provides extensive capabilities for developing high-performance computing instructions. Computation in CUDA relies on kernel functions (kernels) and threads: a kernel function corresponds to a thread grid (grid), a thread grid contains several thread blocks, and a thread block contains several threads. Before a kernel function executes its threads, the index of each thread usually has to be generated. On low-complexity hardware, generating all the indexes in a thread block takes many clock cycles. The execution delay becomes larger, whi...
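To illustrate the grid, block and thread hierarchy described in the background, the sketch below maps a linear thread number within a block to its three-dimensional index and counts the threads in a grid. It follows the standard CUDA convention that the thread at (x, y, z) in a block of size (Dx, Dy, Dz) has linear ID x + y*Dx + z*Dx*Dy. The function names are hypothetical and the code is a host-side illustration only, not what the hardware in the application does.

```python
from typing import Tuple

Dim3 = Tuple[int, int, int]


def thread_index_3d(linear_id: int, block_dim: Dim3) -> Dim3:
    """Convert a linear thread ID within a block to its (x, y, z) index."""
    dx, dy, dz = block_dim
    if not 0 <= linear_id < dx * dy * dz:
        raise ValueError("linear_id is outside the thread block")
    x = linear_id % dx
    y = (linear_id // dx) % dy
    z = linear_id // (dx * dy)
    return (x, y, z)


def threads_in_grid(grid_dim: Dim3, block_dim: Dim3) -> int:
    """Total threads launched: blocks per grid times threads per block."""
    gx, gy, gz = grid_dim
    dx, dy, dz = block_dim
    return gx * gy * gz * dx * dy * dz
```

Every thread in a block needs one such index before the kernel body runs, so the per-block work grows with Dx*Dy*Dz. This is the cost the background identifies as significant on low-complexity hardware.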

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F9/38, G06T1/20
CPC: G06F9/3851, G06T1/20, Y02D10/00
Inventor: 雷宇李原朱建斌付尧永田敏雄
Owner: AZURENGINE TECH ZHUHAI INC